Cell Reports Methods. 2024 Dec 2;4(12):100910. doi: 10.1016/j.crmeth.2024.100910

Enhancing immuno-oncology investigations through multidimensional decoding of tumor microenvironment with IOBR 2.0

Dongqiang Zeng 1,2,3,12, Yiran Fang 3,12, Wenjun Qiu 3, Peng Luo 4, Shixiang Wang 5, Rongfang Shen 6, Wenchao Gu 7, Xiatong Huang 3, Qianqian Mao 3, Gaofeng Wang 8,9, Yonghong Lai 3, Guangda Rong 3, Xi Xu 10, Min Shi 3, Zuqiang Wu 1,2,∗, Guangchuang Yu 11,∗∗, Wangjun Liao 1,2,3,13,∗∗∗
PMCID: PMC11704618  PMID: 39626665

Summary

The use of large transcriptome datasets has greatly improved our understanding of the tumor microenvironment (TME) and helped develop precise immunotherapies. The growing application of multi-omics, single-cell RNA sequencing (scRNA-seq), and spatial transcriptome sequencing has led to many new insights, yet these findings still require clinical validation in large cohorts. To advance multi-omics integration in TME research, we have upgraded the Immuno-Oncology Biological Research (IOBR) package to IOBR 2.0, restructuring and standardizing its analytical workflow. IOBR 2.0 offers six modules for TME analysis based on multi-omics data, covering data preprocessing, TME estimation, TME infiltration pattern identification and cellular interaction analysis, genome and TME interaction, feature visualization, and modeling. Additionally, IOBR 2.0 enables constructing gene signatures and reference matrices from scRNA-seq data for TME deconvolution. The user-friendly pipeline provides comprehensive insights into tumor-immune interactions, and a detailed GitBook (https://iobr.github.io/book/) offers a complete manual and analysis guide for each module.

Keywords: tumor microenvironment, gene signatures, multi-omics, tumor-immune interaction, single-cell data, tumor-metabolism, immunotherapy


Highlights

  • IOBR 2.0 offers a pipeline for TME analysis and biomarker identification

  • Includes six modules spanning RNA data preprocessing, TME profiling, and genome-TME interactions

  • Integrates 10 deconvolution methods and 322 TME-related gene signatures

  • Supports phenotypic analysis in bulk RNA-seq data using single-cell features

Motivation

The growing use of immunotherapy has highlighted the critical role of the TME in influencing treatment outcomes. Advances in sequencing technologies have enabled researchers to analyze the TME from multiple perspectives, leading to new insights. However, the complexity and volume of multi-omics data present challenges for analysis and interpretation. To address these, we present IOBR 2.0, an upgraded version of IOBR 1.0 and an integrated tool that simplifies TME analysis and visualization using multi-omics data. IOBR 2.0 enables researchers to systematically investigate TME characteristics and identify biomarkers for immunotherapy outcomes.


Zeng et al. present IOBR 2.0, a comprehensive and user-friendly toolkit for TME profiling and biomarker discovery in multi-omics studies, updated from IOBR 1.0. Featuring six integrated analysis modules, IOBR 2.0 facilitates the exploration of TME patterns, genome-TME interactions, and single-cell characteristics within bulk RNA-seq data, offering comprehensive visualization functions and enhancing our understanding of tumor-immune interactions.

Introduction

Over the past decade, cancer treatment has undergone revolutionary changes with the advent of cancer immunotherapies and checkpoint inhibitors, reshaping the landscape of cancer therapy. However, clinical responses to immunotherapy vary significantly among patients: some experience long-lasting benefits, while others see no clinical improvement or even tumor progression.1,2 Thus, understanding the diversity of the tumor microenvironment (TME) is crucial to expanding the efficacy of immunotherapy. The TME is a highly structured ecosystem that includes a rich diversity of immune cells, cancer-associated fibroblasts, endothelial cells, pericytes, and other cell types.3,4 Numerous studies have demonstrated that the TME plays a critical role in tumor progression and invasion.5,6 Factors such as tumor type, intrinsic tumor characteristics, tumor stage, and patient status can influence the cellular composition and functional state of the TME. These variations necessitate high-resolution profiling technologies to explore the interactions between cancer-intrinsic characteristics and the TME, laying the foundation for designing precise targeted therapy strategies.7

Currently, studies of the TME using gene expression patterns from large bulk transcriptome datasets have advanced our understanding and identification of the interactions within the TME and aided in the development of more precise immunotherapy treatments for tumors. This progress includes the utilization of gene signature scores, such as T cell-inflamed gene expression profile (GEP),8 pan-fibroblast transforming growth factor β (TGF-β) response signature (Pan-F-TBRS),9 Tumor Immune Dysfunction and Exclusion (TIDE),10 and TMEscore.11 GEP evaluates the activation of the T cell-inflamed TME, predicting the prognosis of patients treated with immunotherapy. Pan-F-TBRS assesses TGF-β signaling in fibroblasts, determining the presence of an immune-excluded phenotype. TIDE identifies tumor immune dysfunction and exclusion, predicting patient resistance to immunotherapy. These signature scores provide multifaceted insights into the TME, guiding clinical treatment decisions. Years of research have shown that transcriptome gene expression signatures can adeptly characterize the TME and exhibit significant potential for clinical translation.12,13 IOBR (Immuno-Oncology Biological Research) 1.0 debuted in 2021, offering a systematic approach to TME profiling and correlation analysis.14 This tool has enabled numerous studies to come to fruition over the last few years.15,16 Simultaneously, we have continuously enhanced and updated IOBR with feedback from our users. The recent surge in single-cell RNA sequencing (scRNA-seq) has enabled us to identify novel microenvironmental cells, TME characteristics, and tumor clonal signatures with higher accuracy.17 It is necessary to validate and characterize these features obtained from high-dimensional single-cell information in bulk sequencing with larger sample sizes for clinical phenotyping.

IOBR 1.0 provided a suite of highly effective functions for microenvironmental analysis and integrated eight key microenvironmental analysis algorithms into the framework, including CIBERSORT,18 EPIC,19 quanTIseq,20 and others.21,22,23,24,25 This integration makes it simple for users to conduct analyses and visualize data using the IOBR 1.0 process. However, the identification of new cells and functions has posed several challenges to users attempting to customize parsing with newly acquired reference data. The advancement in AI and machine learning is driving researchers to focus on identifying patterns within TME and exploring the clinical significance of microenvironmental features.26 In addition, screening important features, evaluating feature robustness, and constructing models have become pressing concerns.

To tackle these challenges, we upgraded IOBR 1.0 to IOBR 2.0, incorporating additional algorithms and redesigning its workflow to create a multidimensional analysis and visualization procedure focused on TME data. We provide detailed data preparation workflows, have added more TME-related signatures, and have enabled users to customize features for TME analysis, including the use of scRNA-seq data to extract cell features. In addition to integrating conventional analysis and deep mining methods, we have expanded our functionalities to include TME interaction analysis, providing a comprehensive one-stop solution for analysis and visualization in transcriptome projects. Overall, we have structured the workflow into six modules: data quality control and processing, profiling the TME, exploring interactions within the TME, examining interactions between the TME and the genome, visualizing the TME, and feature screening and model construction based on key features. By establishing systematic modules for TME analysis, we can successfully conduct multi-dimensional analyses of the TME. Additionally, we have created a comprehensive GitBook with a user-friendly manual to assist researchers in analyzing the TME using IOBR 2.0 (https://iobr.github.io/book/). This system is ideal for large-scale research in multi-omics related to the TME, guiding the discovery of novel biomarkers and advancing precision medicine.

Results

Overview of the IOBR 2.0 workflow

Based on the prior framework, we have enhanced the analytical and visualization capabilities in IOBR 2.0. We have structured the IOBR 2.0 workflow into six functional modules: (1) transcriptome data preparation module; (2) TME deconvolution and signature estimation module; (3) TME interaction module; (4) genome and TME interaction module; (5) TME data visualization and statistical analysis module; and (6) TME modeling module. The schematic workflow and functional code are depicted in Figures 1 and 2, respectively. The corresponding figures are generated dynamically once the function-specific parameters of the relevant modules are supplied. Details of these six modules are provided in the STAR Methods. Charts generated by IOBR 2.0 meet publication-quality standards and can be flexibly modified locally. We have prepared several example datasets from published studies within the IOBR 2.0 package to help users quickly familiarize themselves with its functionalities (see STAR Methods). The workflow and functionalities of IOBR 2.0 are further illustrated below using The Cancer Genome Atlas Urothelial Bladder Carcinoma (TCGA-BLCA) count data (n = 430) and IMvigor210 data (n = 348).9

Figure 1.

The graphical scheme describing the workflow of IOBR 2.0

IOBR 2.0 encompasses transcriptome data preparation, multiple deconvolution algorithms and signature estimation methods for microenvironment analysis, TME pattern identification, analysis of interactions between the genome and TME, batch visualization and statistical analysis, and TME modeling. TME, tumor microenvironment; MAF, mutation annotation format; PCA, principal-component analysis; GSEA, gene set enrichment analysis; RF, random forest; ML, machine learning; AUROC, area under the receiver operating characteristic curve.

Figure 2.

IOBR 2.0 is composed of six analytic modules related to data preprocessing and tumor immune microenvironment

The functionalities of these modules include (1) preprocessing of transcriptome data; (2) estimation of signature scores and identification of phenotype-relevant or user-constructed signatures, along with decoding the TME contexture; (3) identification of TME patterns and analysis of ligand-receptor interactions; (4) estimation of the specific mutation landscape associated with the signature of interest; (5) corresponding batch visualization and statistical analyses; and (6) model construction. WES, whole-exome sequencing.

Preprocessing transcriptome data for downstream analysis

Preprocessing sequencing data is an essential step before beginning analysis, as it ensures the comparability of transcriptome data originating from various sequencing technologies or batches. After RNA-seq read alignment and gene quantification, preprocessing is required to ensure data consistency across different datasets.

The transcriptomic and corresponding clinical data of IMvigor210 were downloaded from the IMvigor210CoreBiologies R package.9 In the transcriptome data preparation module, the count2tpm function efficiently converts IMvigor210’s RNA-seq count data into transcripts per million (TPM) values. During normalization, gene expression matrices are deduplicated and annotated concurrently (Figure 3A). Depending on research needs, the module offers three methods for handling duplicate gene symbols: mean, standard deviation, and sum. Additionally, the module can convert RNA-seq gene identifiers (Ensembl Gene IDs, Entrez Gene IDs) and microarray probe IDs into gene symbols. Notably, the module can also annotate RNA-seq expression matrices for mouse genes.
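As a rough illustration of the count-to-TPM conversion described above, the following Python sketch applies the standard TPM definition (length-normalize each count, then scale the sample to one million). It is not the count2tpm implementation; the function name counts_to_tpm, the gene names, and the gene lengths are hypothetical and chosen only for the example.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to TPM.

    counts: dict mapping gene -> read count
    lengths_bp: dict mapping gene -> effective gene length in base pairs
    """
    # Reads per kilobase: normalize each count by gene length.
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    # Scale so that the values sum to one million within the sample.
    total = sum(rpk.values())
    if total == 0:
        return {g: 0.0 for g in counts}
    return {g: v / total * 1e6 for g, v in rpk.items()}


# Toy sample; counts and gene lengths are made up for illustration.
sample_counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 600}
gene_lengths = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 2000}
tpm = counts_to_tpm(sample_counts, gene_lengths)
```

In this toy sample, GENE_A and GENE_B receive the same TPM despite a threefold difference in raw counts, because GENE_B is three times longer.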

Figure 3.

Preprocessing of transcriptomic data and calculation of TME cell infiltration abundance

(A) Flowchart illustrating the conversion of RNA-seq count data to TPM using the count2tpm function.

(B) PCA scatterplot depicting the distribution of tissue types in the IMvigor210 dataset.

(C) Comparison of data distribution before and after batch correction for IMvigor210 and TCGA-BLCA datasets.

(D and E) Percentage bar plots displaying TME cell percentages based on CIBERSORT (D) and quanTIseq (E).

(F) IOBR 2.0 utilizes cell-type-specific gene expression signatures generated from single-cell analysis to decode the TME landscape of the IMvigor210 cohort from bulk RNA-seq data. Each color represents a different cell type. NA, not available; TPM, transcripts per million.

The presence of outlier samples can affect the reliability of downstream transcriptome analyses. To mitigate this impact, the find_outlier_sample function was employed to remove outliers from the annotated data.27 Next, using the iobr_pca function, we identified potential outlier samples in the IMvigor210 cohort data and visualized the similarities and differences between samples, aiding in a better understanding of data structure and patterns (Figure 3B).

In multi-cohort or multi-batch analyses, in addition to removing outlier samples, correcting batch effects is crucial. Batch effects arise from systematic biases in experimental batches and can influence downstream cohort-level analyses. In the transcriptome data preparation module, the remove_batcheffect function, built on ComBat,28 identifies and visualizes potential batch effects within the cohort. Before merging TCGA-BLCA (n = 430) and IMvigor210 data, we used this function to check for batch effects between the two cohorts (Figure 3C). If batch effects are present, then this function eliminates them across different cohorts or datasets before standardizing count data. The function can visualize corrected data for comparison before and after adjustment. Removing outliers and mitigating batch effects in GEPs is essential for downstream analysis.
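To make the idea of a batch effect concrete, the sketch below removes a per-cohort mean shift for a single gene. This is a deliberately simplified, location-only adjustment and is not ComBat, which additionally models batch-specific variance and uses empirical Bayes shrinkage. The function name center_batches and the toy values are assumptions made for illustration.

```python
from statistics import mean


def center_batches(expr, batch):
    """Remove per-batch mean shifts for one gene.

    expr: list of expression values (for example, log2 TPM), one per sample
    batch: list of batch labels in the same order as expr
    Returns values in which each batch mean is shifted to the overall mean.
    """
    overall = mean(expr)
    batch_means = {
        b: mean(v for v, lab in zip(expr, batch) if lab == b)
        for b in set(batch)
    }
    return [v - batch_means[lab] + overall for v, lab in zip(expr, batch)]


# Toy gene measured in two cohorts with a systematic offset of 2 units.
values = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
labels = ["cohort1"] * 3 + ["cohort2"] * 3
corrected = center_batches(values, labels)
```

After the adjustment both cohorts share the same mean, while the within-cohort differences between samples are preserved.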

In summary, the transcriptome data preparation module simplifies the preprocessing of transcriptomic data and visually presents the data processing steps, aiding in the understanding of GEP structure and patterns.

Unraveling TME profiles and gene signatures associated with treatment outcomes

Immunotherapy, particularly immune checkpoint blockade, is increasingly used in the treatment of advanced cancers. However, the heterogeneity in treatment responses has spurred research to identify molecular features associated with tumor immunity and immunotherapy responses. These features include measures of tumor immune infiltration and gene expression signatures. Estimating cellular components and calculating signature scores are crucial for accurately classifying TME phenotypes and uncovering tumor immune evasion mechanisms.29 In this context, the TME deconvolution and signature estimation module of IOBR 2.0 helps unravel the TME landscape within the growing body of bulk RNA-seq data.

Currently, two main methods are used for estimating immune cell infiltration in the TME: deconvolution-based and marker-based approaches.30 Deconvolution-based methods, such as TIMER22 and CIBERSORT,18 employ mathematical models to extract signals of specific cell types from complex gene expression data. The key to these methods is the use of predefined gene expression signatures to represent different immune cell types. Marker-based methods, such as xCell25 and MCPcounter,24 assess the expression levels of specific markers to calculate an enrichment score, reflecting the relative abundance and activity of each cell type based on these marker genes. The TME deconvolution and signature estimation module integrates 10 methods, including those mentioned, through the deconvo_tme function, which quickly evaluates immune cell infiltration in the TME (Figures 3D and 3E). Because each algorithm uses different gene features, the coverage of cell types varies, leading to differing assessments of immune and stromal cell abundance. We compared the results of the deconvo_tme function with those of different deconvolution algorithm source codes (Figure S1). Users can choose one or more methods with deconvo_tme to evaluate the consistency of results across different algorithms, simplifying the analysis process and providing a more comprehensive evaluation of the TME.
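The deconvolution-based idea can be sketched as a constrained regression: bulk expression is modeled as a non-negative mixture of cell-type signature profiles. The example below uses non-negative least squares solved by projected gradient descent as a simple stand-in; it is not the ν-support vector regression of CIBERSORT nor any of the methods wrapped by deconvo_tme. The signature matrix, the bulk profile, and the function name deconvolve_nnls are hypothetical.

```python
def deconvolve_nnls(signature, bulk, n_iter=2000):
    """Estimate non-negative cell-type weights f minimizing ||S f - b||^2.

    signature: list of rows (genes), each a list of per-cell-type expression
    bulk: list of bulk expression values, one per gene
    Returns weights normalized to sum to 1 (relative fractions).
    """
    n_genes, n_types = len(signature), len(signature[0])
    # Step size from a safe upper bound on the largest eigenvalue of S^T S.
    step = 1.0 / sum(x * x for row in signature for x in row)
    f = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual r = S f - b and gradient g = S^T r.
        resid = [sum(s * w for s, w in zip(row, f)) - b
                 for row, b in zip(signature, bulk)]
        grad = [sum(signature[i][k] * resid[i] for i in range(n_genes))
                for k in range(n_types)]
        # Gradient step followed by projection onto non-negative weights.
        f = [max(0.0, w - step * g) for w, g in zip(f, grad)]
    total = sum(f)
    return [w / total for w in f] if total > 0 else f


# Toy signature matrix: 4 genes (rows) x 3 cell types (columns), made up.
S = [[10.0, 1.0, 0.0],
     [1.0, 8.0, 1.0],
     [0.0, 1.0, 9.0],
     [2.0, 2.0, 2.0]]
true_fractions = [0.5, 0.3, 0.2]
# Simulated bulk profile: an exact mixture of the three signature columns.
bulk = [sum(s * w for s, w in zip(row, true_fractions)) for row in S]
fractions = deconvolve_nnls(S, bulk)
```

Because the simulated bulk profile is a noise-free mixture, the estimated fractions recover the mixing proportions; with real data, different signature matrices and solvers yield different estimates, which is why comparing several methods is useful.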

Another important function in the TME deconvolution and signature estimation module is the rapid calculation of signature scores. The calculate_sig_score function integrates three calculation methods: single-sample gene set enrichment analysis (ssGSEA), principal-component analysis (PCA), and Z score (see STAR Methods). By inputting transcriptomic data processed in the data preprocessing module, the calculate_sig_score function can batch calculate the corresponding signature scores for each sample. Additionally, IOBR 2.0 has expanded the number of built-in gene signatures (n = 322), including those related to the TME, metabolism, gene signatures derived from scRNA-seq data, and others (Table S1). The module also includes gene sets from Gene Ontology,31 Kyoto Encyclopedia of Genes and Genomes,32 HALLMARK,33 and REACTOME databases,34 allowing researchers to assess patients’ GEPs from various perspectives (Figure 1). By inputting TPM data, we ultimately obtained the feature matrix for the IMvigor210 cohort, which will be used for subsequent analyses.
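Of the three scoring approaches named above, the Z score variant is the simplest to illustrate. The sketch below standardizes each signature gene across samples and averages the standardized values per sample. This is one common formulation and is assumed here for illustration; the exact computation inside calculate_sig_score is described in the STAR Methods and may differ. The function name and toy expression values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_signature(expr, signature_genes):
    """Average per-gene z-scores across the genes of one signature.

    expr: dict mapping gene -> list of expression values across samples
    signature_genes: list of gene symbols defining the signature
    """
    n_samples = len(next(iter(expr.values())))
    scores = [0.0] * n_samples
    used = 0
    for gene in signature_genes:
        if gene not in expr:
            continue  # genes missing from the matrix are skipped
        mu, sd = mean(expr[gene]), pstdev(expr[gene])
        if sd == 0:
            continue  # constant genes carry no information
        used += 1
        for i, value in enumerate(expr[gene]):
            scores[i] += (value - mu) / sd
    return [s / used for s in scores] if used else scores


# Toy expression matrix with three samples; values are made up.
expression = {
    "CD8A": [1.0, 2.0, 3.0],
    "GZMB": [2.0, 4.0, 6.0],
    "ACTB": [5.0, 5.0, 5.0],
}
sig_scores = zscore_signature(expression, ["CD8A", "GZMB", "NOT_MEASURED"])
```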

With the advancement and widespread adoption of sequencing technology, an increasing number of gene features can be explored to understand the impact of the TME on immunotherapy. To address the limitation of IOBR 2.0’s signature updates, the module provides two functions, format_signatures and format_msigdb. These functions allow users to create customized gene signature lists for the calculate_sig_score function, enabling more flexible analysis of transcriptomic data based on their research objectives. Besides these methods, users can manually construct signature lists for analysis.
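A customized signature list is, in essence, a mapping from signature names to gene symbols. As an illustration, the sketch below parses text in the tab-delimited GMT layout used by MSigDB (set name, description, then gene symbols) into such a mapping. It does not reproduce the behavior of format_signatures or format_msigdb; the function name parse_gmt and the two gene sets are hypothetical.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into a dict of signature name -> gene list."""
    signatures = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # a valid record needs a name, a description, and genes
        name = fields[0]
        genes = [g for g in fields[2:] if g]
        signatures[name] = genes
    return signatures


# Two made-up gene sets in GMT layout: name, description, genes.
gmt_text = (
    "SIG_T_CELL\tna\tCD3D\tCD3E\tCD8A\n"
    "SIG_FIBROBLAST\tna\tCOL1A1\tCOL1A2\n"
)
custom_signatures = parse_gmt(gmt_text)
```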

Consequently, IOBR 2.0 integrates a variety of deconvolution and scoring algorithms, streamlining the analysis of phenotypic characteristics and cell-type percentages. This facilitates a deeper understanding of gene expression patterns and cellular distributions and their roles in the context of tumor therapy.

Utilizing scRNA-seq-derived signatures to decode bulk sequencing data

With the rapid advancement of scRNA-seq, the number of sequenced cells has increased exponentially over the past decade. This advancement has enabled the discovery of previously unknown rare cell types, elucidation of cellular composition, characterization of cell interactions within tumor tissues, and construction of increasingly detailed single-cell atlases of tumors.35 To leverage the insights gained from single-cell analysis, IOBR 2.0 has updated its single-cell signature extraction function, allowing users to apply single-cell-derived signatures to examine tumor heterogeneity through bulk RNA-seq data.

Using a publicly available scRNA-seq dataset of peripheral blood mononuclear cells (PBMCs) from 10X Genomics, we extracted cell-type-specific gene expression signatures (see STAR Methods). By utilizing scRNA-seq data, IOBR 2.0 enables users to identify cell-type-specific gene expression signatures from cluster analyses reported in existing studies. When inputting a Seurat object, marker genes for each cell type are accurately identified through differential expression analysis using the generateRef_seurat function within IOBR 2.0, which facilitates the generation of a reference matrix. Subsequently, based on prior knowledge of cell-type-specific gene expression signatures, we can estimate the relative abundance of different cell types in a bulk RNA-seq cohort utilizing the deconvo_tme function, which implements either the linear Support Vector Regression algorithm from CIBERSORT or the Least Squares Estimate of Imbalance algorithm (Figure 3F; see STAR Methods).18
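The step from annotated single cells to a reference matrix can be sketched as follows: average the expression of each gene within each cell type, then keep the genes that best separate one type from the others. The margin-based marker ranking below is a simplification standing in for a formal differential expression test and is not the generateRef_seurat procedure. The function name, cell labels, and expression values are hypothetical.

```python
def build_reference(cells, labels, n_markers=1):
    """Build a cell-type reference and pick marker genes.

    cells: list of dicts mapping gene -> expression, one dict per cell
    labels: cell-type annotation for each cell (at least two distinct types)
    Returns (reference, markers): per-type mean expression and top markers.
    """
    genes = sorted(cells[0])
    types = sorted(set(labels))
    # Mean expression of every gene within each annotated cell type.
    reference = {}
    for t in types:
        members = [c for c, lab in zip(cells, labels) if lab == t]
        reference[t] = {g: sum(c[g] for c in members) / len(members)
                        for g in genes}
    # Rank genes by how far the type mean exceeds the highest other type.
    markers = {}
    for t in types:
        others = [o for o in types if o != t]
        margin = {g: reference[t][g] - max(reference[o][g] for o in others)
                  for g in genes}
        markers[t] = sorted(genes, key=lambda g: margin[g],
                            reverse=True)[:n_markers]
    return reference, markers


# Four toy cells with made-up expression values.
toy_cells = [
    {"CD3E": 5.0, "MS4A1": 0.0, "ACTB": 9.0},
    {"CD3E": 7.0, "MS4A1": 1.0, "ACTB": 9.0},
    {"CD3E": 0.0, "MS4A1": 6.0, "ACTB": 9.0},
    {"CD3E": 1.0, "MS4A1": 8.0, "ACTB": 9.0},
]
toy_labels = ["T cell", "T cell", "B cell", "B cell"]
reference, markers = build_reference(toy_cells, toy_labels)
```

The housekeeping-like gene ACTB is equally expressed in both types and is therefore never selected as a marker.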

Additionally, the get_sig_sc function was employed to acquire marker genes from single-cell differential analysis, serving as inputs for the calculate_sig_score function. This enables the calculation of corresponding signature scores. The integration of these functions allows researchers to combine the depth of scRNA-seq data with the breadth of bulk RNA-seq data, providing a more comprehensive tumor heterogeneity map and more precise cell feature analysis.

Through these functions, IOBR 2.0 not only enhances its capability to process single-cell data but also expands its application range in bulk RNA-seq data analysis. This enables researchers to better understand the complexity of tumors and provides valuable insights for precision cancer treatment.

Identifying TME patterns and analyzing intercellular interactions

Based on the molecular characteristics and the infiltration profiles of TME, tumors in different patients exhibit distinct TME patterns. These patterns reveal differences in immune environments and tumor behaviors, which can significantly impact patients’ responses to immunotherapy. Studies have shown that tumors can be categorized as hot or cold based on immune cell infiltration.36 Hot tumors, characterized by substantial effector T cell infiltration, generally respond better to immunotherapy, while cold tumors, lacking sufficient immune cell infiltration, tend to respond poorly. Other research suggests that TME can be classified into three phenotypes: immune activated, immune desert, and immune excluded.37 Patients with different TME phenotypes exhibit significantly different outcomes in immunotherapy. Therefore, identifying TME patterns is crucial for describing the distribution and functional status of immune cells within TME, which is essential for developing tailored immunotherapy strategies.

To facilitate this, IOBR 2.0 has updated the TME interaction module, assisting users in quickly analyzing and studying TME patterns. The tme_cluster function in this module is designed to profile the TME, providing a clear depiction of a patient’s immune landscape and tumor phenotype. After calculating the immune cell infiltration scores using the deconvo_tme function with the CIBERSORT algorithm, we used the tme_cluster function to identify potential TME patterns in the IMvigor210 cohort. Through unsupervised clustering, the function identified two distinct TME clusters (TME1 and TME2). These clusters exhibited distinct immune cell infiltration profiles: TME2 had higher infiltration of M1 macrophages and CD8+ T cells, while TME1 had higher infiltration of mast cells, dendritic cells, and regulatory T cells (Figure 4A). The results of the sig_box function confirmed these findings (Figure 4C).
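Unsupervised clustering of patients by infiltration profile can be illustrated with a small k-means routine. K-means is used here only as a generic example of unsupervised clustering and is not claimed to be the algorithm behind tme_cluster. The patient scores are made up, and the deterministic farthest-point initialization is a choice made so that the example is reproducible.

```python
def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, n_iter=50):
    """Cluster points with Lloyd's algorithm; returns (labels, centers)."""
    # Deterministic initialization: first point, then farthest points.
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(
            points, key=lambda p: min(dist2(p, c) for c in centers))))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Rows are patients; columns are two made-up infiltration z-scores
# (for example, CD8+ T cells and mast cells).
patients = [[1.2, -0.8], [0.9, -1.1], [1.0, -0.9],
            [-1.0, 1.1], [-0.8, 0.9], [-1.3, 1.0]]
cluster_labels, cluster_centers = kmeans(patients, k=2)
```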

Figure 4.

IOBR 2.0 identifies TME patterns and analyzes differences in clinical features, phenotype, and TME components between TME clusters

(A) Heatmap showing the TME-infiltrating cell signature score of each patient in the two clusters (blue = TME1; red = TME2), with high and low scores shown in red and blue, respectively. Rows of the heatmap represent TME-infiltrating cell signature expression (Z scores) calculated using the CIBERSORT algorithm via the deconvo_tme function. TME1 and TME2 were identified as distinct TME patterns using the tme_cluster function based on the TME-infiltrating cell signatures of CIBERSORT.

(B) Kaplan-Meier curves comparing OS between TME1 and TME2.

(C) Boxplots showing the expression of M1 and CD8+ T cell (Z scores), mutation burden per megabase (MB) and neoantigen burden per MB between TME1 and TME2.

(D) Bar plot showing the clinical features and biomarkers, including immune phenotype, IC level, TC level, and BOR between TME1 and TME2.

(E and F) The boxplot (E) and heatmap (F) delineate the metabolism signatures enrolled in IOBR 2.0, and identify signatures associated with the TME pattern. p value in (E) was calculated using two-sided Mann-Whitney U test. ∗p < 0.05; ∗∗p < 0.01; ∗∗∗p < 0.001; ∗∗∗∗p < 0.0001; ns, not significant compared to isotype group.

IC level, immune cells level; TC level, tumor cells level; NA, not available; NE, not evaluable; CR, complete response; PR, partial response; SD, stable disease; PD, progressive disease; BOR, best overall response.

Next, we evaluated programmed death ligand 1 (PD-L1) expression, tumor phenotypes, and treatment outcomes between the two TME clusters (tumor cells scored as percentage of PD-L1-expressing tumor cells: TC3 ≥50%, TC2 ≥5% and <50%, TC1 ≥1% and <5%, and TC0 <1%; tumor-infiltrating immune cells expressing PD-L1 scored as percentage of tumor area: IC3 ≥10%, IC2 ≥5% and <10%, IC1 ≥1% and <5%, and IC0 <1%). The TME2 cluster showed higher immune cells level (IC level) and tumor cells level (TC level), more immune phenotypes, and a greater number of complete response/partial response patients (Figure 4D). This suggests that the TME2 cluster might achieve better outcomes with immunotherapy. Survival analysis using the surv_group function indicated that patients in the TME2 cluster had better survival outcomes than those in the TME1 cluster, but the difference was not statistically significant (Figure 4B). We also compared different signatures between TME1 and TME2 and found significant differences in metabolism-related signatures between the two clusters (Figures 4E, 4F, and S2A–S2D). This implies that while immune cell infiltration alone can distinguish different TME patterns, it may not be sufficient to guide clinical treatment decisions.
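The PD-L1 scoring thresholds quoted above translate directly into two small lookup functions. The sketch below encodes exactly those cutoffs; the function names tc_level and ic_level are hypothetical and are not part of IOBR 2.0.

```python
def tc_level(pct_tumor_cells):
    """Map the percentage of PD-L1-expressing tumor cells to a TC category."""
    if pct_tumor_cells >= 50:
        return "TC3"
    if pct_tumor_cells >= 5:
        return "TC2"
    if pct_tumor_cells >= 1:
        return "TC1"
    return "TC0"


def ic_level(pct_tumor_area):
    """Map PD-L1-positive immune cells (percentage of tumor area) to IC."""
    if pct_tumor_area >= 10:
        return "IC3"
    if pct_tumor_area >= 5:
        return "IC2"
    if pct_tumor_area >= 1:
        return "IC1"
    return "IC0"


# Boundary values fall into the higher category, as the cutoffs use >=.
example_levels = [tc_level(50), tc_level(4.9), ic_level(10), ic_level(0.5)]
```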

Figure 5.

Screening of immunotherapy prognostic features and construction of efficacy prediction models

(A) The forest plots reveal 12 gene signatures correlated with OS in IMvigor210 cohort.

(B) The correlation plot reflects the Spearman correlation between M1 macrophages and mutation burden per MB.

(C) Kaplan-Meier curves comparing OS among four TME clusters (TME1–4). IOBR 2.0 identifies these four TME clusters based on 30 features closely related to immunotherapy prognosis.

(D) Heatmap of immunotherapy prognosis-related signature scores derived from ssGSEA through IOBR 2.0 for the four TME clusters.

(E) Kaplan-Meier curves comparing OS between high- and low-score group. The riskscore cutoff is set at the mean value.

(F) The time-dependent ROC curves and AUC of riskscore (red), mutation burden per MB (gray), and the combined model (blue) predicting the clinical benefit for patients at 12 months. The combined model is constructed by integrating riskscore and mutation burden per MB using a Cox regression approach. ROC, receiver operating characteristic; ssGSEA, single-sample gene set enrichment analysis; OS, overall survival.

To delve deeper into the impact of intracellular signaling and intercellular interactions on TME characteristics, we have developed the lr_cal function of the TME interaction module, based on the EaSIeR package.38 This function calculates interaction weights for 813 ligand-receptor pairs, using either count or TPM data (Figures S3A and S3B), and it aids in synthesizing a comprehensive overview of the TME by integrating various types of prior knowledge.
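One simple way to weight a ligand-receptor pair in bulk data is to take the lower of the two log-transformed expression values, so that the pair is only as strong as its weaker partner. The sketch below assumes this minimum-of-log2(TPM + 1) definition for illustration; whether lr_cal computes its weights in exactly this way should be checked against the STAR Methods. The function name and TPM values are hypothetical.

```python
import math


def lr_weight(tpm, ligand, receptor):
    """Weight of one ligand-receptor pair in one sample.

    tpm: dict mapping gene -> TPM value for the sample
    Assumed definition: min of log2(TPM + 1) for ligand and receptor.
    """
    return min(math.log2(tpm[ligand] + 1), math.log2(tpm[receptor] + 1))


# Made-up TPM values for a single sample.
sample_tpm = {"CD274": 7.0, "PDCD1": 3.0}
pair_weight = lr_weight(sample_tpm, ligand="CD274", receptor="PDCD1")
```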

This module provides a variety of analytical tools, enabling researchers to gain a comprehensive understanding of the complexity of the TME, identify different TME patterns, and assess their potential impact on immunotherapy responses.

Screening immunotherapy prognostic features and constructing efficacy prediction models

Based on input features, users can quickly identify TME patterns within a population using the TME interaction module. This module also enables the comparison of clinical characteristics, cell infiltration, and feature signatures among different clusters through visualization methods.39 However, not all TME patterns can stratify patients or guide precise tumor treatment. The advent of AI and machine learning algorithms offers novel approaches for comprehensive analysis of patients’ TME and immunotherapy predictions.26 Many studies have demonstrated the value of constructing models to predict patients’ responses to immunotherapy. By integrating the functionalities of different modules, IOBR 2.0 provides a standardized analytical workflow to identify features related to clinical prognosis and construct efficacy prediction models.

In previous analyses, we obtained various signature scores for patients. Our current objective is to identify prognostic features and construct efficacy prediction models to guide clinical decision-making. To ensure data quality and applicability, we preprocess the data using the feature_manipulation function. The TME data visualization and statistical analysis module offers functions for rapid biomarker screening and result visualization. Using this module, we conduct feature engineering based on patients’ survival responses and treatment outcomes. Through the batch_surv function, we perform batch survival analyses, calculating the hazard ratio for each feature to identify genes closely related to immunotherapy prognosis. Important features are visualized, showing that signatures like M1 macrophages and TMEscore are strongly correlated with favorable prognosis (Figure 5A). These results are validated through the Kaplan-Meier survival analysis (Figure S3A).
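The Kaplan-Meier validation mentioned above rests on the product-limit estimator, which can be written in a few lines. The sketch below is a generic textbook version, not the code used by IOBR 2.0; the follow-up times and event indicators are made up.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each patient
    events: 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (events and censored) leaves the risk set.
        at_risk -= n_at_t
        i += n_at_t
    return curve


# Five toy patients: times in months, 1 = death observed, 0 = censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```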

In the second step, we use the batch_wilcoxon function to identify gene signatures associated with treatment response (response/non-response). Additionally, the get_cor function allows us to rapidly analyze the correlation between two different variables, facilitating the identification of genes and signatures significantly correlated with target features (Figure 5B). Through these steps, we have identified signatures related to clinical prognosis. We observed distinct clustering relationships between significant biomarkers and treatment response (Figure S4C). To identify potential patterns, we perform clustering on these features. Finally, using the tme_cluster function, we discover that the data can be effectively grouped into four clusters, with TME4 patients exhibiting the best prognosis (Figures 5C and 5D). This indicates the potential to construct a model based on the identified signatures to predict patients’ responses to immunotherapy. Utilizing the TME modeling module, we developed LASSO and Ridge regression models to predict patients’ survival outcomes (Figures S4D and S4E). We selected the LASSO model with the highest area under the receiver operating characteristic curve (AUC) for further analysis. The results demonstrate that the riskscore effectively stratified patients, with those having low riskscore exhibiting significantly longer overall survival (OS) compared to those with high riskscore.
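Once a penalized regression model has been fitted, the riskscore of a patient is the weighted sum of that patient's signature scores, and patients can then be split into high and low groups at a cutoff such as the cohort mean. The sketch below illustrates only this scoring and stratification step; the coefficients and signature names are invented and do not come from the LASSO model reported here.

```python
def risk_scores(patients, coefficients):
    """Linear predictor: sum of coefficient x signature score per patient."""
    return [sum(coefficients[name] * row[name] for name in coefficients)
            for row in patients]


def stratify_by_mean(scores):
    """Label each patient as high or low risk relative to the mean score."""
    cutoff = sum(scores) / len(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical model coefficients and per-patient signature scores.
model_coefficients = {"sig_A": 0.5, "sig_B": -1.0}
cohort = [
    {"sig_A": 2.0, "sig_B": 0.0},
    {"sig_A": 0.0, "sig_B": 1.0},
    {"sig_A": 1.0, "sig_B": 0.5},
]
scores = risk_scores(cohort, model_coefficients)
groups = stratify_by_mean(scores)
```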

Assessing signature-associated mutations and mapping relevant mutation landscapes in cancer

Tumor stratification based on transcriptomics often faces the challenge that patients within the same TME pattern may exhibit varying treatment outcomes. Genetic mutations, as central elements in tumor development, interact with other biological processes to modulate the transcriptomic landscape, potentially affecting patient prognosis.40 Therefore, it is crucial to recognize that specific gene mutations can initiate cancer, influence interactions between cancer cells and the immune system, and affect responses to immunotherapy across various cancer types.41 To address these complexities, we developed the genome and TME interaction module. This module is designed to meet the needs of understanding and interpreting the dynamics between genomic alterations and TME. To illustrate the module’s functionality, we conducted a comprehensive genomic analysis of the IMvigor210 cohort (Figure 6).

Figure 6.


Genome and TME interaction module delineates mutations associated with TMEscore and the corresponding oncoplot

(A) Boxplots displaying the mutations significantly associated with TMEscore, including TP53, FGFR3, ERBB3, and PIK3CA. The blue and yellow colors represent mutated and wild-type status, respectively.

(B) Kaplan-Meier curves comparing OS between mutated and wild-type status of TP53.

(C) Oncoprints depicting the genomic alteration landscapes in the context of high and low TMEscore. The numbers on the left green bars and on the right side collectively demonstrate the mutation frequency of each gene. WT, wild type; Mut, mutant.

Within the IOBR 2.0 framework, the make_mut_matrix function processes genomic data in mutation annotation format (MAF) and converts it into a matrix suitable for the find_mutations function. The find_mutations function takes the mutation data together with the matrix of gene signatures of interest, enabling users to explore relationships between specific genes or signatures and mutations. Based on previous research,11 we investigated the relationship between TMEscore and mutations in urothelial carcinoma. We identified a series of mutations associated with the TMEscore, including TP53, FGFR3, ERBB3, and PIK3CA. The module visualizes differences in TMEscore between wild-type and mutant samples using boxplots (Figures 6A and 6B). Alternatively, oncoplots can be used to visually depict genomic alterations in samples with high and low TMEscore (Figure 6C).
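The conversion from MAF records to a mutation matrix can be illustrated with a minimal Python sketch. This is not the make_mut_matrix implementation; it only shows the idea of collapsing one-row-per-variant records into a binary sample-by-gene matrix. The column names Hugo_Symbol and Tumor_Sample_Barcode follow the standard MAF layout, whereas the function name, the sample identifiers, and the toy records are hypothetical.

```python
from collections import defaultdict


def maf_to_mutation_matrix(records):
    """Collapse MAF-like rows into a binary sample-by-gene matrix.

    records: iterable of dicts with the MAF columns "Tumor_Sample_Barcode"
    and "Hugo_Symbol". Returns (matrix, gene_order), where matrix maps
    sample -> {gene: 1 if mutated, 0 if wild type}.
    """
    mutated = defaultdict(set)
    genes = set()
    for rec in records:
        mutated[rec["Tumor_Sample_Barcode"]].add(rec["Hugo_Symbol"])
        genes.add(rec["Hugo_Symbol"])
    gene_order = sorted(genes)
    matrix = {
        sample: {gene: int(gene in hits) for gene in gene_order}
        for sample, hits in sorted(mutated.items())
    }
    return matrix, gene_order


# Invented toy records for illustration only (not data from any cohort).
toy_maf = [
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "TP53"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "FGFR3"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "TP53"},  # second variant, same gene
    {"Tumor_Sample_Barcode": "S2", "Hugo_Symbol": "TP53"},
    {"Tumor_Sample_Barcode": "S3", "Hugo_Symbol": "PIK3CA"},
]
mut_matrix, gene_order = maf_to_mutation_matrix(toy_maf)
for sample, row in mut_matrix.items():
    print(sample, [row[g] for g in gene_order])
```

Note that a sample with no reported variant never appears in a MAF file, so in practice such samples would have to be added as all-zero rows from the sample sheet.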

Overall, understanding the relationship between the genome and TME in tumors is critical for identifying potential biomarkers and therapeutic targets. Integrating genomic and transcriptomic data allows researchers to gain insights into the molecular mechanisms of tumorigenesis, uncover interactions between the TME and the genome, and lay the foundation for developing personalized medical strategies. The genome and TME interaction module of IOBR 2.0 significantly simplifies these analytical procedures.

TME data visualization and statistical analysis

To meet the needs of multi-omics analysis, additional visualization and analytical methods have been integrated into the TME data visualization and statistical analysis module. The iobr_deg function integrates multiple differential gene analysis methods, enabling rapid differential expression analysis between two groups. This function employs methods including DESeq242 for RNA-seq data and limma43 for microarray data, and it visually presents differences between groups using volcano plots or heatmaps, aiding in the identification of characteristic genes. For multiple group comparisons, the find_markers_in_bulk function can perform pairwise comparisons in bulk, identifying differential genes associated with group distinctions.

To streamline the feature analysis process, IOBR 2.0 incorporates the iobr_cor_plot function, which is designed for efficient and rapid exploration of various datasets. This function dynamically generates statistical results and effectively illustrates the correlation between signatures and specific phenotypes, such as therapeutic responses or mutation statuses. Additionally, IOBR 2.0 offers the get_cor_matrix and get_cor functions, which calculate the Pearson correlation between two or more features. The sig_box function can be employed to examine the correlation between a categorical variable and a specific signature, generating a boxplot to show the statistical variance in signature scores across different categories. Additionally, the sig_heatmap function visually demonstrates the differences in correlations between features and categories through heatmaps.

Leveraging gene signatures to predict specific phenotypes and survival benefits in response to therapy is a well-established approach in preclinical bioinformatics. The sig_forest function takes the survival analysis output generated by the batch_surv function and draws a forest plot with the hazard ratios of multiple signatures. The surv_group and sig_surv_plot functions generate Kaplan-Meier survival plots for data grouped by different cutoffs. The sig_roc function, built on the pROC44 R package, generates receiver operating characteristic (ROC) curves with their AUC values for multiple signatures. The compare method parameter of this function enables users to assess the statistical difference between any two signatures of interest with an optional method. Additionally, the roc_time function, based on the timeROC45 R package, generates time-dependent ROC curves to evaluate the predictive performance of various variables. Given that IOBR 2.0 can quantify various cell infiltration scores and gene signature scores, the module includes additional batch analysis tools and visualization methods, such as batch correlation analysis and survival analysis, as well as statistical methods like the Wilcoxon test.
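To make the AUC concrete, the following Python sketch computes it directly from its rank interpretation: the probability that a randomly chosen responder has a higher signature score than a randomly chosen non-responder, with ties counted as one half. It is an illustration only and does not reproduce sig_roc or pROC; the function name and the toy scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: signature scores, one per sample.
    labels: 1 for the positive class (e.g., responder), 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Invented toy example: one of the four positive/negative pairs is misordered.
auc = roc_auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
print(auc)
```

The quadratic loop is adequate for cohort-sized data; a rank-based formula gives the same value more efficiently for large inputs.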

By integrating these advanced analytical and visualization functions, IOBR 2.0 enhances the ability to conduct comprehensive and detailed analyses of the TME, thereby facilitating a deeper understanding of tumor biology, aiding in the discovery of novel biomarkers, and advancing the development of precision medicine.

Discussion

The rapid development and widespread application of sequencing technologies have enabled scientists to dissect the tumor’s immune microenvironment from multiple dimensions. Currently, RNA-seq has evolved into a mature and cost-effective analytical technique. It is increasingly utilized to explore cancer-immune interactions and characterize cancer cells and the TME. Computational methods using transcriptomic profiling are instrumental in understanding tumor immunity and in characterizing prognostic and predictive markers of immunotherapy response.18 These methods provide valuable insights into immune response predictive markers, such as estimates of tumor-immune cell infiltration and gene expression signatures.29 Furthermore, the advancement of single-cell sequencing technology offers high-resolution data on immune cell populations and the ability to detect variations between individual cells and cell groups.17,46 By integrating and analyzing multi-omics data, including transcriptomics, single cell, and genomics, we can dissect the underlying gene regulatory mechanisms and reveal the landscape of the TME. This deepens our understanding of the interactions and biological mechanisms between immune and tumor cells.47,48 However, the complexity and growing volume of multi-omics data introduce new opportunities and challenges in analyzing the TME.

In previous research, our team developed IOBR 1.0, a user-friendly and comprehensive tool specifically designed for TME analysis.14 IOBR 1.0 assists users in efficiently and accurately parsing and visualizing the TME and in exploring clinically relevant features and biomarkers. In this study, we introduce IOBR 2.0, an upgraded version in which we have expanded the analysis and visualization capabilities and redesigned the workflow. Building on a multi-omics approach with a focus on transcriptomics, IOBR 2.0 comprehensively dissects the tumor immune microenvironment to unearth features related to tumor immunity and immunotherapy responses. IOBR 2.0 offers six main functional modules. It goes beyond systematic analysis of transcriptomic, genomic, and single-cell data by compiling a wide range of statistical and clinical analysis methods. Additionally, IOBR 2.0 integrates an array of visualization functions and enhances model construction and validation capabilities. IOBR 2.0 provides a streamlined and efficient transcriptomic data preprocessing workflow, including data quality control and processing. A signature estimation function has been developed in IOBR 2.0 alongside the integration of various cell deconvolution algorithms for rapid parsing and characterization of the TME. Besides the signatures documented in IOBR 2.0, published single-cell signatures have been incorporated. Users can also customize signatures and gene sets based on their findings in bulk RNA-seq or scRNA-seq data and their oncological insights. Users can then validate their findings in different transcriptomic datasets using IOBR 2.0, including signature score computation and cell deconvolution.
Furthermore, unveiling TME patterns and intratumor interactions, providing comprehensive descriptions of the TME, identifying effective biomarkers, and decoding the mechanisms behind patient treatment responses to ultimately predict immunotherapy efficacy have always been critical research directions.1 Addressing this, IOBR 2.0 has added a module for TME interaction analysis. This module employs clustering to determine TME patterns and analyzes the interactions of infiltrating cell receptor-ligand pairs within the microenvironment, providing diverse perspectives on the TME. Additionally, IOBR 2.0 provides various visualization functions suitable for different scenarios, facilitating the batch visualization of TME features and enabling rapid analysis of correlations between TME characteristics and clinical phenotypes. We have also introduced feature screening and model construction functions, assisting clinicians in efficiently identifying targets and biomarkers closely associated with patient treatment prognosis. To demonstrate these functionalities, we have showcased the key features of each module and the foundational strategies for combining different modules using IMvigor210 data.

In conclusion, IOBR 2.0 offers a comprehensive pipeline for downstream transcriptomic analysis of the TME. It enables the integration and analysis of scRNA-seq and genomic data, providing the means to reveal the multidimensional landscape of the TME. IOBR 2.0 has multiple functions for microenvironment analysis, including cell abundance estimation and signature score calculation. It can also analyze TME interactions and integrate traditional analytical and modeling approaches, providing a comprehensive analysis and visualization solution for transcriptome projects. IOBR 2.0 is expected to continue playing a significant role in the future with the ongoing advancements in multi-omics and AI. It is ready to advance research in cancer immunology and immuno-oncology, providing a deeper understanding of tumor immunity and responses to immunotherapy.

Limitations of the study

While IOBR 2.0 integrates transcriptomic and genomic data to explore the impact of the TME on patient phenotypes, it still has some limitations. First, the genomic analysis tools in IOBR 2.0 are relatively limited and support only basic analyses. Due to technical constraints, IOBR 2.0 does not support upstream genomic analyses such as somatic mutation prediction, neoantigen prediction, human leukocyte antigen (HLA) typing, and affinity prediction.49 There are many established tools for these analyses, such as Mutect2 (somatic mutation detection),50 OptiType (HLA allele identification),51 and NetMHCpan (HLA binding affinity prediction and neoantigen identification).52 IOBR 2.0 provides some support for downstream analyses, enabling visualization and interpretation of mutation data to help researchers understand how genomic variations influence tumor immunogenicity, but it still lacks functions for comprehensively integrating and analyzing these upstream results. Second, the functions for scRNA-seq analysis in IOBR 2.0 need further enhancement. Currently, IOBR 2.0 can extract single-cell signatures from Seurat objects and perform cell deconvolution calculations. However, it cannot yet perform batch calculations of signature scores based on scRNA-seq data.

In the future, we plan to incorporate epigenetic, proteomic, and single-cell transcriptomic analysis modules into IOBR 2.0 for a more comprehensive exploration of how these data layers relate to the TME. Furthermore, the current architecture of IOBR 2.0 does not fully exploit the S4 object system, a shortfall we plan to address by transitioning key analytical components to S4 objects. This advancement will streamline the analysis pipeline, ensuring a more efficient and integrated workflow suitable for immuno-oncology biological research.

Resource availability

Lead contact

Further information and requests for resources should be directed to and will be fulfilled by the lead contact, Wangjun Liao (liaowj@smu.edu.cn).

Materials availability

This study did not generate new unique reagents.

Data and code availability

  • The RNA-seq data for bladder cancer (IMvigor210) are available in the IMvigor210CoreBiologies R package. The dataset is also accessible at EGA: EGAS00001002556. Raw counts gene expression and clinical information of TCGA-BLCA (n = 430) were downloaded from TCGA (https://portal.gdc.cancer.gov/). The dataset of PBMCs is freely available from 10X Genomics.

  • The IOBR 2.0 R package is available at https://github.com/IOBR/IOBR. The DOI at Zenodo is https://doi.org/10.5281/zenodo.13986663. The GitBook (https://iobr.github.io/book/) provides a complete analysis workflow for each module within the package, including numerous examples and detailed explanations of its functions.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Acknowledgments

The work was supported by Guangdong Provincial Science and Technology Project (no. 2020A0505090007 to J.W.) and Science and Technology Program of Guangzhou (no. 202206080011 to W.L.). The authors thank EGA for providing the multi-omics data of IMvigor210. The authors thank all the patients, their families, investigators, and healthcare professionals who participated in this study. The authors are also grateful to the researchers who generated and shared the sequencing data openly. Some elements of the graphical abstract were created using BioRender.com, with permission.

Author contributions

D.Z. and W.L. contributed to the conceptualization and study design. D.Z., Y.F., P.L., and W.Q. contributed to acquisition. D.Z., Y.F., and Q.M. contributed to data analysis and interpretation. D.Z., Y.F., W.Q., Q.M., and S.W. contributed to package development. D.Z. and Y.F. drafted the tutorial. D.Z., Y.F., and X.X. were responsible for writing the initial draft of the manuscript. D.Z., Y.F., P.L., S.W., G.Y., and W.L. participated in revising the manuscript. W.L., Z.W., and G.Y. supervised the project and reviewed the manuscript. All the authors have read, discussed, and approved the final version of the manuscript. The corresponding author had full access to the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Declaration of interests

The authors declare no competing interests.

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Deposited data
RNA-seq data from IMvigor210 | Mariathasan et al.9 | EGA: EGAS00001002556
TCGA-BLCA RNA-seq data | TCGA | https://portal.gdc.cancer.gov/
scRNA-seq data of peripheral blood mononuclear cells (PBMCs) | 10X Genomics | https://cf.10xgenomics.com/samples/cell/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz

Software and algorithms
IOBR version 2.0 | This paper | GitHub: https://github.com/IOBR/IOBR; Zenodo: https://doi.org/10.5281/zenodo.13986663
R version 4.2.1 | N/A | https://www.r-project.org/
RStudio | N/A | https://www.rstudio.com/
Seurat version 4.1.1 | Butler et al.53 | http://satijalab.org/seurat
UCSCXenaTools version 1.4.8 | Wang et al.54 | https://cran.r-project.org/web/packages/UCSCXenaTools/index.html
maftools version 2.12.0 | Mayakonda et al.55 | https://bioconductor.org/packages/release/bioc/html/maftools.html
clusterProfiler version 4.4.4 | Wu et al.56 | https://github.com/YuLab-SMU/clusterProfiler
enrichplot version 1.16.2 | Yu et al. | https://bioconductor.org/packages/release/bioc/html/enrichplot.html
ComplexHeatmap version 2.12.1 | Gu et al.57 | https://bioconductor.org/packages/release/bioc/html/ComplexHeatmap.html
sva version 3.35.2 | Zhang et al.28 | https://bioconductor.org/packages/release/bioc/html/sva.html
GSVA version 1.44.3 | Hanzelmann et al.58 | https://bioconductor.org/packages/release/bioc/html/GSVA.html
EaSIeR version 1.2.2 | Lapuente-Santana et al.38 | https://www.bioconductor.org/packages/release/bioc/html/easier.html
msigdbr version 7.5.1 | CRAN Repository | https://cran.r-project.org/package=msigdbr
survminer version 0.4.9 | CRAN Repository | https://cran.r-project.org/web/packages/survminer/index.html
survival version 3.4-0 | CRAN Repository | https://cran.r-project.org/web/packages/survival/index.html

Method details

The framework of IOBR 2.0

Building upon the existing functions of IOBR 1.0, IOBR 2.0 introduces additional analysis and visualization capabilities; its implementation and functionalities are detailed, together with a complete analysis pipeline, in the tutorial (https://iobr.github.io/book/).14 IOBR 2.0 encompasses six functional modules: 1) Transcriptome data prepare module (preprocessing of transcriptome data, as well as pertinent batch statistical analyses); 2) TME deconvolution and signature estimation module (estimation of signature scores and identification of phenotype-relevant signatures, along with decoding immune contexture); 3) TME interaction module (clustering TME characteristics and analyzing receptor-ligand interactions); 4) Genome and TME interaction module (analysis of signature-associated mutations); 5) TME data visualization and statistical analysis module (visual representation and statistical examination of TME data); 6) TME modeling module (fast model construction and the assessment of model performance).

Transcriptome data prepare module

In line with the preprocessing workflow of transcriptomic data, we have integrated a variety of functionalities into IOBR 2.0. When a gene appears more than once in an expression matrix, IOBR 2.0 allows users to retain it based on the maximum or average value of the duplicated entries. Additionally, we have developed an annotation function to annotate expression matrices. The annotation files in IOBR 2.0 include anno_hug133plus2, anno_rnaseq, and anno_illumina, corresponding to annotations for HG-U133 Plus 2.0 microarray probes, RNA-seq annotation data, and Illumina microarray probes, respectively.

In IOBR 2.0, we have established a function for differentially expressed gene (DEG) analysis between two groups. This function supports two analytical methods, limma43 and DESeq2.42 limma fits a linear model to assess changes in gene expression and uses an empirical Bayes method to moderate the gene-wise variance estimates. Originally designed for microarray data, its utility has been extended to small-sample RNA-seq data analysis. DESeq2, specifically designed for RNA-seq data analysis, uses a negative binomial distribution to model gene expression data, applying either the Wald test or likelihood ratio test to each gene to detect expression differences. Users can choose the appropriate method based on their data type and research needs. Additionally, IOBR 2.0 supports DEG analysis for more than two groups. It leverages the Seurat R package to identify significant markers across multiple groups within the dataset.59 For comparing ROC curves between signatures, the available methods include bootstrap, delong, and venkatraman, offering a range of options for comprehensive analysis.

In RNA-seq, counts represent the number of reads mapped to exons after upstream data alignment and are used for gene quantification. However, read counts are influenced by factors such as sequencing depth and gene length. To mitigate these influences, read counts need to be normalized. Reads Per Kilobase per Million (RPKM), Fragments Per Kilobase per Million (FPKM), and Transcripts Per Million (TPM) are common units for gene quantification. Compared to RPKM and FPKM, TPM is more suitable for gene comparisons between samples because TPM values sum to the same total in every sample. Therefore, IOBR 2.0 supports the rapid conversion of gene expression count data into TPM values. During the annotation and conversion processes, additional operations such as merging annotation data with the expression matrices, removing unnecessary columns, transposing rows and columns, and handling duplicates based on the specified method can be performed at the same time. For sequencing data from different sources or batches, users can use IOBR 2.0 to examine batch effects in the data and perform batch correction. Furthermore, we have built a filtering function for rapid analysis of gene expression data and identification of outliers in the dataset.
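The count-to-TPM conversion described above can be written in a few lines. The sketch below is a minimal Python illustration of the standard TPM formula for a single sample, not the IOBR 2.0 conversion routine; the function name, gene names, and gene lengths are hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts of one sample to TPM.

    counts: dict mapping gene -> raw read count.
    lengths_bp: dict mapping gene -> effective gene length in base pairs.
    """
    # Step 1: length-normalize each count (reads per kilobase).
    rate = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rate.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    # Step 2: scale so that the values of the sample sum to one million.
    return {gene: r / total * 1e6 for gene, r in rate.items()}


# Invented toy sample: gene B has twice the reads of gene A but is twice as long,
# so both genes end up with the same TPM.
tpm = counts_to_tpm({"A": 10, "B": 20}, {"A": 1000, "B": 2000})
print(tpm)
```

Because the per-sample total is fixed at one million, TPM values are directly comparable as proportions across samples, which is the property that motivates preferring TPM over RPKM and FPKM.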

TME deconvolution and signature estimation module

Signature estimation

To enhance the characterization of the TME in cancer cells and to deepen our understanding of tumor immunity and its functional states, we have developed an estimation function for user-generated signatures or the 322 published signatures included in IOBR 2.0 (Table S1). The extensive signature collection is categorized into three distinct groups: TME-associated, tumor-metabolism, and tumor-intrinsic signatures. Additionally, IOBR 2.0 supports the estimation of signature gene sets derived from the GO,31 KEGG,32 HALLMARK,33 and REACTOME34 databases. IOBR 2.0 allows users to generate custom signature lists aligned with their own biological discoveries or exploratory needs, thereby streamlining the estimation process and enabling systematic follow-up exploration. Users also have the option to generate signature lists from single-cell differential analysis or the MSigDB database (gsea-msigdb.org) for their subsequent research.

In the evaluation of signature scores, we incorporated three methodologies: ssGSEA, PCA, and Z score. ssGSEA is extensively used to evaluate the enrichment or activity of specific gene sets within individual samples.60 Each ssGSEA enrichment score reflects the collective expression dynamics of a specific gene set in a single sample, indicating whether the genes in the set are collectively upregulated or downregulated. PCA calculates principal components to reduce the dimensionality of the data while preserving as much of its variability as possible for predictive model construction. Current signatures constructed using the PCA methodology include the Pan-F-TBRs9 and the TMEscore,11,61 two promising biomarkers for predicting clinical outcomes and assessing the sensitivity of malignancies to treatments. The Z score, a statistical metric, measures how far a value deviates from the mean of a dataset, in units of standard deviation.
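As an illustration of the Z score approach, the Python sketch below standardizes each signature gene across samples and averages the resulting Z scores within each sample. This is one common way of turning Z scores into a signature score and is shown under that assumption; it is not claimed to match the exact computation in IOBR 2.0. The function name and toy expression values are hypothetical.

```python
from statistics import mean, stdev


def zscore_signature(expr, signature_genes):
    """Per-sample signature score as the mean Z score of the signature genes.

    expr: dict mapping gene -> list of expression values (one per sample,
    same sample order for every gene).
    signature_genes: genes that define the signature.
    """
    z_rows = []
    for gene in signature_genes:
        values = expr.get(gene)
        if values is None:
            continue  # gene not measured in this dataset
        sd = stdev(values)
        if sd == 0:
            continue  # a constant gene carries no information
        mu = mean(values)
        z_rows.append([(v - mu) / sd for v in values])
    if not z_rows:
        raise ValueError("no usable signature genes in the expression data")
    # Average over genes, sample by sample.
    return [mean(column) for column in zip(*z_rows)]


# Invented toy data: three samples, one constant gene, one missing gene.
toy_expr = {"G1": [1.0, 2.0, 3.0], "G2": [2.0, 4.0, 6.0], "G3": [5.0, 5.0, 5.0]}
sig_scores = zscore_signature(toy_expr, ["G1", "G2", "G3", "G_missing"])
print(sig_scores)
```

Standardizing per gene before averaging prevents highly expressed genes from dominating the score.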

TME deconvolution

Different mechanisms in the TME are involved in mediating the immune response and affect the efficacy of treatment. An important aspect is the cell-type composition of the TME, which is a key element shaping the intricate landscape of anti-tumor immunity.12 Deciphering the cellular composition of the TME is a significant technical challenge, addressed by various deconvolution algorithms, each with its unique advantages and limitations.62,63 IOBR 2.0 integrates 10 open-source deconvolution methodologies: CIBERSORT, Support Vector Regression (SVR),18 Least Squares Estimate of Imbalance (LSEI),64 ESTIMATE,21 quanTIseq,20 TIMER,22 IPS,23 MCPCounter,24 xCell,25 and EPIC.19

CIBERSORT is the most well-recognized method for identifying 22 immune cell types in the TME, allowing large-scale analysis of transcriptome data for cellular biomarkers and therapeutic targets with promising accuracy.18 Notably, IOBR 2.0 leverages CIBERSORT's linear support vector regression principle, allowing users to create custom signatures and extending its input file compatibility to cell subsets derived from single-cell sequencing results. ESTIMATE focuses on non-malignant components, such as stromal and immune signatures, to assess tumor purity.21 The quanTIseq method quantifies 10 immune cell subsets from bulk RNA-seq data.20 TIMER is adept at quantifying the abundance of tumor-infiltrating immune compartments.22 It offers six major analytic modules, enabling detailed analysis of immune infiltration alongside other cancer molecular profiles. IPS calculates 28 TIL subpopulations, including effector and memory T cells and immunosuppressive cells.23 MCP-counter robustly quantifies the absolute abundance of eight immune and two stromal cell populations within heterogeneous tissues, using transcriptomic data.24 xCell offers an extensive analysis of 64 immune cell types from RNA-seq data, including various cell subsets in bulk tumor tissues.25 EPIC decodes the proportion of immune and cancer cells from the expression of genes, comparing it to specific cell expression profiles to accurately predict the cellular subpopulation landscape.19

Signatures derived from scRNA-seq data

Moving beyond traditional bulk sequencing, validating the clinical relevance of cell types identified by scRNA-seq becomes essential. To facilitate this, IOBR 2.0 integrates linear SVR of CIBERSORT with the LSEI algorithms,64 enabling a streamlined analysis of bulk RNA-seq data for the clinical validation of targets identified through scRNA-seq data. In addition, IOBR 2.0 allows the user to screen DEGs of cell types based on Seurat objects,59 performing cellular deconvolution related to the research objectives or constructing signature gene lists for IOBR 2.0 scoring calculations. Furthermore, IOBR 2.0 can derive gene signatures from single-cell differential analysis for use in feature scoring calculations.
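The principle behind reference-based deconvolution can be sketched as follows: a bulk expression profile is modeled as a non-negative mixture of cell-type reference profiles, and the mixture weights are the estimated cell fractions. The Python sketch below solves this with non-negative least squares by projected gradient descent and then renormalizes the weights to sum to one. It is a deliberately simple stand-in and implements neither CIBERSORT's support vector regression nor the LSEI solver; the function name and the toy reference matrix are hypothetical.

```python
def deconvolve_nnls(signature, bulk, n_iter=2000):
    """Estimate cell-type fractions from a bulk profile.

    signature: list of rows, one per gene; each row holds the reference
    expression of that gene in every cell type.
    bulk: list with the bulk expression of the same genes.
    Returns fractions that are non-negative and sum to 1.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    # The sum of squared entries bounds the largest eigenvalue of S^T S,
    # so its inverse is a safe gradient step size.
    step = 1.0 / sum(v * v for row in signature for v in row)
    fractions = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [
            sum(s * f for s, f in zip(row, fractions)) - b
            for row, b in zip(signature, bulk)
        ]
        gradient = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        fractions = [max(0.0, f - step * g) for f, g in zip(fractions, gradient)]
    total = sum(fractions)
    if total == 0:
        raise ValueError("all estimated fractions are zero")
    return [f / total for f in fractions]


# Invented toy reference: 4 genes x 3 cell types, and a noise-free mixture.
toy_signature = [[10, 1, 0], [0, 8, 1], [1, 0, 9], [5, 5, 5]]
true_fractions = [0.5, 0.3, 0.2]
toy_bulk = [sum(s * f for s, f in zip(row, true_fractions)) for row in toy_signature]
estimated = deconvolve_nnls(toy_signature, toy_bulk)
print([round(f, 4) for f in estimated])
```

With real data the bulk profile is noisy and reference profiles are correlated, which is why robust or constrained solvers are used in practice.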

TME interaction module

To facilitate a deeper analysis of the TME and identify distinct TME patterns in patients, we have developed a clustering function based on the NbClust R package.65 This function enables unsupervised clustering analysis using datasets generated by users or signature estimation scores from IOBR 2.0. Based on the results, IOBR 2.0 can determine the optimal number of clusters and assign each sample to a specific cluster. Additionally, IOBR 2.0 offers a function for analyzing ligand-receptor pairs within the TME. It evaluates 813 pairs of ligand-receptor interactions based on gene expression patterns. These pairs are expressed in 25 cell types that are present in the TME, including immune cells, cancer cells, fibroblasts, endothelial cells, and adipocytes.38 Users provide transcriptomic data as input, allowing IOBR 2.0 to generate group-specific, system-based signatures of the TME. A pairwise Wilcoxon test is then employed to identify distinctive signatures and ligand-receptor interactions between different groups.

Genome and TME interaction module

IOBR 2.0 not only focuses on systematic signature-phenotype studies but also expands its research scope to include the exploration of interactions between transcriptomes, microenvironments, and genome profiles. It accepts genomic data in MAF55 or user-generated mutation matrices as input for identifying mutations associated with specific signatures. Additionally, IOBR 2.0 supports the transformation of MAF data into a comprehensive mutation matrix. This matrix contains data on distinct variation types, including insertion-deletion mutations (indels), single-nucleotide polymorphisms (SNPs), and frameshift mutations, or it can integrate all these mutation types, offering users flexible selection options. For the analysis of mutations significantly linked to targeted signatures, IOBR 2.0 employs the Wilcoxon rank-sum test in this module for batch analysis. Moreover, IOBR 2.0 supports batch visualization, allowing users to easily view and interpret the mutation status (mutation or non-mutation) of specified genes or regions.
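The batch analysis described above amounts to running one Wilcoxon rank-sum test per gene, comparing signature scores between mutated and wild-type samples. The following Python sketch shows this idea using the normal approximation with tie correction and no continuity correction, so p values for very small groups are only approximate. It is not the IOBR 2.0 implementation; the function names, the min_group threshold, and the toy data are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test; returns (U statistic of x, p value)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    values = list(x) + list(y)
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    tie_term = 0.0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0  # average rank of a run of ties
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    n = n1 + n2
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u, 1.0  # all values identical
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2.0))


def batch_mutation_test(scores, mutation_matrix, min_group=2):
    """Test every gene; mutation_matrix maps sample -> {gene: 0 or 1}."""
    genes = sorted({g for status in mutation_matrix.values() for g in status})
    results = {}
    for gene in genes:
        mutant = [scores[s] for s, st in mutation_matrix.items()
                  if s in scores and st.get(gene, 0) == 1]
        wild = [scores[s] for s, st in mutation_matrix.items()
                if s in scores and st.get(gene, 0) == 0]
        if len(mutant) >= min_group and len(wild) >= min_group:
            results[gene] = rank_sum_test(mutant, wild)[1]
    return results


# Invented toy data: GENE_B is mutated in a single sample and is skipped.
toy_scores = {"S1": 0.1, "S2": 0.2, "S3": 0.3, "S4": 0.9, "S5": 1.0, "S6": 1.1}
toy_mut = {
    "S1": {"GENE_A": 0, "GENE_B": 1}, "S2": {"GENE_A": 0, "GENE_B": 0},
    "S3": {"GENE_A": 0, "GENE_B": 0}, "S4": {"GENE_A": 1, "GENE_B": 0},
    "S5": {"GENE_A": 1, "GENE_B": 0}, "S6": {"GENE_A": 1, "GENE_B": 0},
}
p_values = batch_mutation_test(toy_scores, toy_mut)
print(p_values)
```

When many genes are tested at once, the resulting p values should be adjusted for multiple comparisons before interpretation.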

TME data visualization and statistical analysis module

Batch analysis and visualization of results from the TME deconvolution and signature estimation module are pivotal features of IOBR 2.0. To implement TME deconvolution and signature computation for potential clinical translation, we have systematically categorized the collected signatures into 43 groups (Table S2), expanding upon the foundation of IOBR 2.0. These categories encompass TME cell populations (classified by deconvolution methods, cell types, or scRNA-seq results), signatures of immune phenotypes, tumor metabolism, HALLMARK and so on. Users can freely adjust the number of signatures within each group and also utilize signatures documented in IOBR 2.0 to construct new groups for research exploration. IOBR 2.0 also supports the construction of new signature groups based on immune-oncological research findings or specific study objectives, enabling users to tailor their analysis to their unique research needs. Further, we integrate a visualization function specifically for batch correlation analysis of signature groups, either user-generated or enrolled in IOBR 2.0. This function allows for visualizations based on specified groups, including boxplots and heatmaps, and employs the Wilcoxon rank-sum Test to compare statistical differences in signatures between groups. Moreover, IOBR 2.0 is capable of presenting TME cell fractions as percentage bar charts in batch visualization, supporting input of deconvolution results from "CIBERSORT",18 "EPIC"19 and "quanTIseq"20 methodologies to further compare the TME cell distributions within one sample or among different samples.

To provide a more intuitive understanding of the TME and streamline the analysis process, IOBR 2.0 has introduced a range of batch visualization and statistical functions. The batch analysis methods supported by IOBR 2.0 include the batch Wilcoxon rank-sum test between two groups, batch calculation of hazard ratios and confidence intervals for the specified signatures, and batch correlation analysis. IOBR 2.0 supports computing correlations between two features or genes, or between a target variable and multiple variables, and visualizes these correlations through heatmaps or scatterplots. It supports two correlation methods: the Pearson correlation coefficient and Spearman's rank correlation coefficient. Furthermore, IOBR 2.0 provides other independent analysis and visualization functions, including KM survival analysis, GSEA, and PCA. Notably, IOBR 2.0 allows users to perform GSEA based on user-generated signatures or signatures registered in IOBR 2.0. The TME data visualization and statistical analysis module of IOBR 2.0 enables easy integration and visualization of the aforementioned deconvolution results, offering flexibility in selecting specific methodologies of interest. This module permits systematic identification of phenotype-relevant signatures, cell fractions, or signature genes, accompanied by corresponding batch statistical analyses and visualization options. Within IOBR 2.0, these methods are available for users to choose for targeted analysis or integration, complemented by a range of visualization tools.
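The difference between the two correlation methods can be seen in a short Python sketch: Spearman's coefficient is simply the Pearson coefficient computed on ranks, so it captures any monotonic relationship, whereas Pearson measures linear association. The function names and toy values are hypothetical and the code is independent of the IOBR 2.0 functions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the midranks."""
    return pearson(midranks(x), midranks(y))


# Invented toy data: a monotonic but non-linear relationship.
feature = [1.0, 2.0, 3.0, 4.0]
score = [1.0, 4.0, 9.0, 16.0]
r_pearson = pearson(feature, score)
r_spearman = spearman(feature, score)
print(round(r_pearson, 4), round(r_spearman, 4))
```

For skewed signature scores or data with outliers, the rank-based coefficient is usually the more robust choice.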

TME modeling module

For effective application of the signatures in clinical interpretation, IOBR 2.0 provides functions for feature selection, robust biomarker identification, and model construction based on previously identified phenotype-associated signatures. Utilizing these features to build prognostic models holds promise for accurately and cost-effectively predicting tumor patient survival and treatment sensitivities. In addition, IOBR 2.0 supports the performance assessment of models predicting patient survival and treatment responses, providing valuable tools for evaluating the efficacy and applicability of these models in clinical settings.

Quantification and statistical analysis

For two-group comparisons, an unpaired Student's t test was used for normally distributed variables, while the Mann-Whitney U test (Wilcoxon rank-sum test) was used for non-normally distributed variables. In analyses involving more than two groups, the Kruskal-Wallis test was used for non-parametric data and one-way ANOVA for parametric data. Correlation analysis was conducted using Spearman's rank correlation, with p values calculated using exact methods for small sample sizes and asymptotic approximations for larger samples. Survival data were analyzed using the Cox proportional hazards regression model to estimate hazard ratios and 95% confidence intervals for each variable. Subgroup differences in survival were assessed using the Kaplan-Meier method, with statistical significance determined by the log rank (Mantel-Cox) test. All statistical analyses were conducted in R (version 4.2.1) (https://www.r-project.org/), with two-sided p values. Statistical significance was defined as p < 0.05, with Benjamini-Hochberg correction applied for multiple comparisons.66
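For readers unfamiliar with the Benjamini-Hochberg procedure, the Python sketch below computes adjusted p values in the usual step-up fashion: each p value is multiplied by the number of tests divided by its rank, and monotonicity is enforced from the largest p value downward. The analyses in this study were run in R; this code is only an illustration, and the function name and toy p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that each adjusted
    # value never exceeds the adjusted value of any larger p value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented toy p values from four hypothetical tests.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
print(adj_p)
```

A test is then declared significant when its adjusted p value falls below the chosen threshold, here 0.05.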

Published: December 2, 2024

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.crmeth.2024.100910.

Contributor Information

Zuqiang Wu, Email: lywuzq@scut.edu.cn.

Guangchuang Yu, Email: gcyu1@smu.edu.cn.

Wangjun Liao, Email: liaowj@smu.edu.cn.

Supplemental information

Document S1. Figures S1–S4
mmc1.pdf (5.3MB, pdf)
Table S1. Collection and relevant citations of all 322 signatures enrolled in the IOBR 2.0, related to STAR Methods
mmc2.xlsx (24.2KB, xlsx)
Table S2. The signatures collected in IOBR 2.0 were systematically categorized into 43 groups, related to STAR Methods
mmc3.xlsx (15.7KB, xlsx)
Document S2. Article plus supplemental information
mmc4.pdf (12.6MB, pdf)

References

1. Robert C. A decade of immune-checkpoint inhibitors in cancer therapy. Nat. Commun. 2020;11:3801. doi: 10.1038/s41467-020-17670-y.
2. Morad G., Helmink B.A., Sharma P., Wargo J.A. Hallmarks of response, resistance, and toxicity to immune checkpoint blockade. Cell. 2021;184:5309–5337. doi: 10.1016/j.cell.2021.09.020.
3. de Visser K.E., Joyce J.A. The evolving tumor microenvironment: From cancer initiation to metastatic outgrowth. Cancer Cell. 2023;41:374–403. doi: 10.1016/j.ccell.2023.02.016.
4. Ye Z., Zeng D., Zhou R., Shi M., Liao W. Tumor Microenvironment Evaluation for Gastrointestinal Cancer in the Era of Immunotherapy and Machine Learning. Front. Immunol. 2022;13. doi: 10.3389/fimmu.2022.819807.
5. Zeng D., Wang M., Wu J., Lin S., Ye Z., Zhou R., Wang G., Wu J., Sun H., Bin J., et al. Immunosuppressive Microenvironment Revealed by Immune Cell Landscape in Pre-metastatic Liver of Colorectal Cancer. Front. Oncol. 2021;11. doi: 10.3389/fonc.2021.620688.
6. Duan R., Li X., Zeng D., Chen X., Shen B., Zhu D., Zhu L., Yu Y., Wang D. Tumor Microenvironment Status Predicts the Efficacy of Postoperative Chemotherapy or Radiochemotherapy in Resected Gastric Cancer. Front. Immunol. 2020;11. doi: 10.3389/fimmu.2020.609337.
7. Combes A.J., Samad B., Krummel M.F. Defining and using immune archetypes to classify and treat cancer. Nat. Rev. Cancer. 2023;23:491–505. doi: 10.1038/s41568-023-00578-2.
8. Cristescu R., Mogg R., Ayers M., Albright A., Murphy E., Yearley J., Sher X., Liu X.Q., Lu H., Nebozhyn M., et al. Pan-tumor genomic biomarkers for PD-1 checkpoint blockade-based immunotherapy. Science. 2018;362. doi: 10.1126/science.aar3593.
9. Mariathasan S., Turley S.J., Nickles D., Castiglioni A., Yuen K., Wang Y., Kadel E.E., III, Koeppen H., Astarita J.L., Cubas R., et al. TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature. 2018;554:544–548. doi: 10.1038/nature25501.
10. Jiang P., Gu S., Pan D., Fu J., Sahu A., Hu X., Li Z., Traugh N., Bu X., Li B., et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat. Med. 2018;24:1550–1558. doi: 10.1038/s41591-018-0136-1.
11. Zeng D., Li M., Zhou R., Zhang J., Sun H., Shi M., Bin J., Liao Y., Rao J., Liao W. Tumor Microenvironment Characterization in Gastric Cancer Identifies Prognostic and Immunotherapeutically Relevant Gene Signatures. Cancer Immunol. Res. 2019;7:737–750. doi: 10.1158/2326-6066.CIR-18-0436.
12. Fridman W.H., Zitvogel L., Sautès-Fridman C., Kroemer G. The immune contexture in cancer prognosis and treatment. Nat. Rev. Clin. Oncol. 2017;14:717–734. doi: 10.1038/nrclinonc.2017.101.
13. Zhou Y., Tao L., Qiu J., Xu J., Yang X., Zhang Y., Tian X., Guan X., Cen X., Zhao Y. Tumor biomarkers for diagnosis, prognosis and targeted therapy. Signal Transduct. Targeted Ther. 2024;9:132. doi: 10.1038/s41392-024-01823-2.
14. Zeng D., Ye Z., Shen R., Yu G., Wu J., Xiong Y., Zhou R., Qiu W., Huang N., Sun L., et al. IOBR: Multi-Omics Immuno-Oncology Biological Research to Decode Tumor Microenvironment and Signatures. Front. Immunol. 2021;12. doi: 10.3389/fimmu.2021.687975.
15. Chen C., Lin C.J., Pei Y.C., Ma D., Liao L., Li S.Y., Fan L., Di G.H., Wu S.Y., Liu X.Y., et al. Comprehensive genomic profiling of breast cancers characterizes germline-somatic mutation interactions mediating therapeutic vulnerabilities. Cell Discov. 2023;9:125. doi: 10.1038/s41421-023-00614-3.
16. Li C., Yang L., Zhang Y., Hou Q., Wang S., Lu S., Tao Y., Hu W., Zhao L. Integrating single-cell and bulk transcriptomic analyses to develop a cancer-associated fibroblast-derived biomarker for predicting prognosis and therapeutic response in breast cancer. Front. Immunol. 2023;14. doi: 10.3389/fimmu.2023.1307588.
17. Rood J.E., Maartens A., Hupalowska A., Teichmann S.A., Regev A. Impact of the Human Cell Atlas on medicine. Nat. Med. 2022;28:2486–2496. doi: 10.1038/s41591-022-02104-7.
18. Newman A.M., Steen C.B., Liu C.L., Gentles A.J., Chaudhuri A.A., Scherer F., Khodadoust M.S., Esfahani M.S., Luca B.A., Steiner D., et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 2019;37:773–782. doi: 10.1038/s41587-019-0114-2.
19. Racle J., de Jonge K., Baumgaertner P., Speiser D.E., Gfeller D. Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data. Elife. 2017;6. doi: 10.7554/eLife.26476.
20. Finotello F., Mayer C., Plattner C., Laschober G., Rieder D., Hackl H., Krogsdam A., Loncova Z., Posch W., Wilflingseder D., et al. Molecular and pharmacological modulators of the tumor immune contexture revealed by deconvolution of RNA-seq data. Genome Med. 2019;11:34. doi: 10.1186/s13073-019-0638-6.
21. Yoshihara K., Shahmoradgoli M., Martínez E., Vegesna R., Kim H., Torres-Garcia W., Treviño V., Shen H., Laird P.W., Levine D.A., et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 2013;4. doi: 10.1038/ncomms3612.
22. Li B., Severson E., Pignon J.C., Zhao H., Li T., Novak J., Jiang P., Shen H., Aster J.C., Rodig S., et al. Comprehensive analyses of tumor immunity: implications for cancer immunotherapy. Genome Biol. 2016;17:174. doi: 10.1186/s13059-016-1028-7.
23. Charoentong P., Finotello F., Angelova M., Mayer C., Efremova M., Rieder D., Hackl H., Trajanoski Z. Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade. Cell Rep. 2017;18:248–262. doi: 10.1016/j.celrep.2016.12.019.
24. Becht E., Giraldo N.A., Lacroix L., Buttard B., Elarouci N., Petitprez F., Selves J., Laurent-Puig P., Sautès-Fridman C., Fridman W.H., de Reyniès A. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 2016;17:218. doi: 10.1186/s13059-016-1070-5.
25. Aran D., Hu Z., Butte A.J. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017;18. doi: 10.1186/s13059-017-1349-1.
26. Yang Y., Zhao Y., Liu X., Huang J. Artificial intelligence for prediction of response to cancer immunotherapy. Semin. Cancer Biol. 2022;87:137–147. doi: 10.1016/j.semcancer.2022.11.008.
27. Langfelder P., Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9:559. doi: 10.1186/1471-2105-9-559.
28. Zhang Y., Parmigiani G., Johnson W.E. ComBat-seq: batch effect adjustment for RNA-seq count data. NAR Genom. Bioinform. 2020;2. doi: 10.1093/nargab/lqaa078.
29. Yang L., Wang J., Altreuter J., Jhaveri A., Wong C.J., Song L., Fu J., Taing L., Bodapati S., Sahu A., et al. Tutorial: integrative computational analysis of bulk RNA-sequencing data to characterize tumor immunity using RIMA. Nat. Protoc. 2023;18:2404–2414. doi: 10.1038/s41596-023-00841-8.
30. Sturm G., Finotello F., Petitprez F., Zhang J.D., Baumbach J., Fridman W.H., List M., Aneichyk T. Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology. Bioinformatics. 2019;35:i436–i445. doi: 10.1093/bioinformatics/btz363.
31. Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2021;49:D325–D334. doi: 10.1093/nar/gkaa1113.
32. Kanehisa M., Sato Y., Kawashima M., Furumichi M., Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2016;44:D457–D462. doi: 10.1093/nar/gkv1070.
33. Liberzon A., Birger C., Thorvaldsdóttir H., Ghandi M., Mesirov J.P., Tamayo P. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004.
34. Croft D., O'Kelly G., Wu G., Haw R., Gillespie M., Matthews L., Caudy M., Garapati P., Gopinath G., Jassal B., et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 2011;39:D691–D697. doi: 10.1093/nar/gkq1018.
35. Svensson V., Vento-Tormo R., Teichmann S.A. Exponential scaling of single-cell RNA-seq in the past decade. Nat. Protoc. 2018;13:599–604. doi: 10.1038/nprot.2017.149.
36. Zhang J., Huang D., Saw P.E., Song E. Turning cold tumors hot: from molecular mechanisms to clinical applications. Trends Immunol. 2022;43:523–545. doi: 10.1016/j.it.2022.04.010.
37. Chen D.S., Mellman I. Elements of cancer immunity and the cancer–immune set point. Nature. 2017;541:321–330. doi: 10.1038/nature21349.
38. Lapuente-Santana Ó., van Genderen M., Hilbers P.A.J., Finotello F., Eduati F. Interpretable systems biomarkers predict response to immune-checkpoint inhibitors. Patterns (N Y). 2021;2. doi: 10.1016/j.patter.2021.100293.
39. Swanson K., Wu E., Zhang A., Alizadeh A.A., Zou J. From patterns to patients: Advances in clinical machine learning for cancer diagnosis, prognosis, and treatment. Cell. 2023;186:1772–1791. doi: 10.1016/j.cell.2023.01.035.
40. Flavahan W.A., Drier Y., Liau B.B., Gillespie S.M., Venteicher A.S., Stemmer-Rachamimov A.O., Suvà M.L., Bernstein B.E. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature. 2016;529:110–114. doi: 10.1038/nature16490.
41. Wang D., Liu B., Zhang Z. Accelerating the understanding of cancer biology through the lens of genomics. Cell. 2023;186:1755–1771. doi: 10.1016/j.cell.2023.02.015.
42. Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15. doi: 10.1186/s13059-014-0550-8.
43. Ritchie M.E., Phipson B., Wu D., Hu Y., Law C.W., Shi W., Smyth G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47. doi: 10.1093/nar/gkv007.
44. Robin X., Turck N., Hainard A., Tiberti N., Lisacek F., Sanchez J.C., Müller M. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinf. 2011;12:77. doi: 10.1186/1471-2105-12-77.
45. Blanche P., Dartigues J.F., Jacqmin-Gadda H. Estimating and comparing time-dependent areas under receiver operating characteristic curves for censored event times with competing risks. Stat. Med. 2013;32:5381–5397. doi: 10.1002/sim.5958.
46. Lei Y., Tang R., Xu J., Wang W., Zhang B., Liu J., Yu X., Shi S. Applications of single-cell sequencing in cancer research: progress and perspectives. J. Hematol. Oncol. 2021;14:91. doi: 10.1186/s13045-021-01105-2.
47. Baysoy A., Bai Z., Satija R., Fan R. The technological landscape and applications of single-cell multi-omics. Nat. Rev. Mol. Cell Biol. 2023;24:695–713. doi: 10.1038/s41580-023-00615-w.
48. Tang X., Zhang J., He Y., Zhang X., Lin Z., Partarrieu S., Hanna E.B., Ren Z., Shen H., Yang Y., et al. Explainable multi-task learning for multi-modality biological data analysis. Nat. Commun. 2023;14:2546. doi: 10.1038/s41467-023-37477-x.
49. Bonsack M., Hoppe S., Winter J., Tichy D., Zeller C., Küpper M.D., Schitter E.C., Blatnik R., Riemer A.B. Performance Evaluation of MHC Class-I Binding Prediction Tools Based on an Experimentally Validated MHC-Peptide Binding Data Set. Cancer Immunol. Res. 2019;7:719–736. doi: 10.1158/2326-6066.CIR-18-0584.
50. Cibulskis K., Lawrence M.S., Carter S.L., Sivachenko A., Jaffe D., Sougnez C., Gabriel S., Meyerson M., Lander E.S., Getz G. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 2013;31:213–219. doi: 10.1038/nbt.2514.
51. Szolek A., Schubert B., Mohr C., Sturm M., Feldhahn M., Kohlbacher O. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics. 2014;30:3310–3316. doi: 10.1093/bioinformatics/btu548.
52. Reynisson B., Alvarez B., Paul S., Peters B., Nielsen M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 2020;48:W449–W454. doi: 10.1093/nar/gkaa379.
53. Butler A., Hoffman P., Smibert P., Papalexi E., Satija R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 2018;36:411–420. doi: 10.1038/nbt.4096.
54. Wang S., Liu X. The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. J. Open Source Softw. 2019;4:1627. doi: 10.21105/joss.01627.
55. Mayakonda A., Lin D.C., Assenov Y., Plass C., Koeffler H.P. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018;28:1747–1756. doi: 10.1101/gr.239244.118.
56. Wu T., Hu E., Xu S., Chen M., Guo P., Dai Z., Feng T., Zhou L., Tang W., Zhan L., et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation. 2021;2. doi: 10.1016/j.xinn.2021.100141.
57. Gu Z., Eils R., Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847–2849. doi: 10.1093/bioinformatics/btw313.
58. Hanzelmann S., Castelo R., Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinf. 2013;14:7. doi: 10.1186/1471-2105-14-7.
59. Hao Y., Hao S., Andersen-Nissen E., Mauck W.M., 3rd, Zheng S., Butler A., Lee M.J., Wilk A.J., Darby C., Zager M., et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29. doi: 10.1016/j.cell.2021.04.048.
60. Barbie D.A., Tamayo P., Boehm J.S., Kim S.Y., Moody S.E., Dunn I.F., Schinzel A.C., Sandy P., Meylan E., Scholl C., et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature. 2009;462:108–112. doi: 10.1038/nature08460.
61. Zeng D., Wu J., Luo H., Li Y., Xiao J., Peng J., Ye Z., Zhou R., Yu Y., Wang G., et al. Tumor microenvironment evaluation promotes precise checkpoint immunotherapy of advanced gastric cancer. J. Immunother. Cancer. 2021;9. doi: 10.1136/jitc-2021-002467.
62. Avila Cobos F., Alquicira-Hernandez J., Powell J.E., Mestdagh P., De Preter K. Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat. Commun. 2020;11:5650. doi: 10.1038/s41467-020-19015-1.
63. Chen Z., Ji C., Shen Q., Liu W., Qin F.X.F., Wu A. Tissue-specific deconvolution of immune cell composition by integrating bulk and single-cell transcriptomes. Bioinformatics. 2020;36:819–827. doi: 10.1093/bioinformatics/btz672.
64. Gong T., Szustakowski J.D. DeconRNASeq: a statistical framework for deconvolution of heterogeneous tissue samples based on mRNA-Seq data. Bioinformatics. 2013;29:1083–1085. doi: 10.1093/bioinformatics/btt090.
65. Charrad M., Ghazzali N., Boiteau V., Niknafs A. NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set. J. Stat. Software. 2014;61.
66. Benjamini Y., Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. Roy. Stat. Soc. B. 1995;57:289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x.

Data Availability Statement

  • The RNA-seq data for bladder cancer (IMvigor210) are available in the IMvigor210CoreBiologies R package. The dataset is also accessible at EGA: EGAS00001002556. Raw gene expression counts and clinical information for TCGA-BLCA (n = 430) were downloaded from TCGA (https://portal.gdc.cancer.gov/). The PBMC dataset is freely available from 10X Genomics.

  • The IOBR 2.0 R package is available at https://github.com/IOBR/IOBR. The DOI at Zenodo is https://doi.org/10.5281/zenodo.13986663. The GitBook (https://iobr.github.io/book/) provides a complete analysis workflow for each module within the package, including numerous examples and detailed explanations of its functions.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from Cell Reports Methods are provided here courtesy of Elsevier
