Journal of Oncology. 2022 Sep 16;2022:8177948. doi: 10.1155/2022/8177948

Construction of Molecular Subtype and Prognosis Prediction Model of Osteosarcoma Based on Aging-Related Genes

Chunli Dong 1, Yindi Sun 2, Ying Zhang 1, Bianni Qin 1, Tao Lei 2
PMCID: PMC9507679  PMID: 36157228

Abstract

Background

Osteosarcoma (OS) is a rare form of malignant bone cancer that is usually detected in young adults and adolescents. This disease shows a poor prognosis owing to its metastatic status and resistance to chemotherapy. Hence, it is necessary to design a risk model that can successfully forecast the OS prognosis in patients.

Methods

The researchers retrieved the RNA sequencing data and clinical follow-up data of OS patients from the TARGET and GEO databases. Univariate Cox regression analysis, performed with the coxph function in R, was used to identify the aging-related genes associated with OS prognosis. Consensus clustering was conducted using the ConsensusClusterPlus R package. The ESTIMATE, MCPcounter, and GSVA (ssGSEA) R packages were used to assess the immune scores of the subtypes. Univariate Cox and Lasso regression analyses were used to screen genes and develop a risk model. ROC curves were constructed using the pROC package, and survival curves for evaluating the risk model were drawn with the survminer package.

Results

The OS patients were classified into 2 clusters according to the aging-related genes. Cluster 1 patients showed a better prognosis than Cluster 2 patients, and the two clusters showed different immune microenvironments. Additional screening of the prognosis-associated genes identified 5 genes, i.e., ERCC4, GPX4, EPS8, TERT, and STAT5A, which were used to develop the risk model. This risk model divided the training set samples into high- and low-risk groups. Patients in the high-risk group showed a poorer prognosis than the low-risk patients. The researchers verified the reliability and robustness of the 5-gene signature using internal and external datasets. The risk model predicted prognosis effectively even in samples with differing clinical features. Compared with other models, the 5-gene model performed better in predicting the risk of osteosarcoma.

Conclusion

The 5-gene signature developed by the researchers in this study could be effectively used for forecasting the OS prognosis in patients.

1. Introduction

Osteosarcoma is a malignant tumor of the bone. It originates in mesenchymal tissue and arises most often in the proximal tibia and distal femur. A characteristic feature of OS is that the tumor cells directly form bone-like tissue or immature bone [1]. Osteosarcoma is mostly seen in adolescents and people below the age of 20 years, with a high degree of malignancy and a tendency toward pulmonary metastasis [2]. Studies have found a close correlation between the rapid growth of adolescent bones and the onset and progression of osteosarcoma [3]. With advances in clinical diagnosis and surgical treatment, nonmetastatic OS patients now show a 5-year overall survival rate of 60–70% [4]; however, patients with recurrent or metastatic osteosarcoma show a 5-year overall survival rate of only 20% [5]. Therefore, researchers need to identify the important regulatory targets in the occurrence and metastasis of osteosarcoma and develop new prognostic markers for osteosarcoma patients.

Cell aging is a generally stable state in which cells stop progressing through the cell cycle, as a result of changes in the external microenvironment or in internal gene expression and inactivation, and lose their capacity to multiply indefinitely [6]. Researchers have noted that cellular aging is related to tumorigenesis, tumor development, and therapy evasion. In the early stage of tumorigenesis, the inflammatory reaction is conducive to the elimination of aging and mutant cells and to the prevention and inhibition of tumorigenesis [7]. However, during the later stages of tumor development, the inflammatory microenvironment changes. It then consists primarily of growth factors and inflammatory molecules secreted by aging cells, which can induce the epithelial-mesenchymal transition of the tumor cells and promote their migration, proliferation, invasion, and metastasis [8]. Hence, the potential role of aging in tumorigenesis and tumor development needs to be studied.

In this study, the researchers acquired the RNA-Seq data of 85 osteosarcoma patients from the TARGET database and classified the samples into 2 groups based on the aging-related genes. Using the genes significantly related to prognosis, they further constructed a 5-gene risk model comprising TERT, GPX4, ERCC4, EPS8, and STAT5A. This 5-gene risk model is effective in forecasting the prognosis of OS patients.

2. Materials and Methods

2.1. Analysis Process

Figure 1 presents the analytical flow chart used in this paper.

Figure 1.

Flow chart of the analysis.

2.2. Source and Pretreatment of Data

The researchers retrieved the RNA-Seq data for the osteosarcoma (OS) patients along with their clinical follow-up data from the TARGET database. They also downloaded the GSE21257 chip data, together with the survival time of the OS patients, from the Gene Expression Omnibus (GEO) database. This data set included the expression data of 53 tissue samples.

The RNA-Seq data of 85 TARGET-OS cases were processed as follows: (1) eliminating all samples that did not contain the clinical follow-up data, (2) discarding the samples that did not present the overall survival data, and (3) eliminating the samples that did not reflect the patients' status.

The researchers processed the data sets for the 53 GEO patients as follows: (1) eliminating samples that did not include the clinical follow-up data, (2) discarding all samples that did not present the overall survival, and (3) eliminating the samples that did not reflect the patients' status. After the pretreatment of the two groups of data, the TARGET-OS included 85 samples, comprising 302 genes (Supplement Table 1), while the GSE21257 consisted of 53 samples that were included in the external, independent verification dataset. Table 1 describes the clinical information for the population sample.

Table 1.

Sample information table.

Clinical features    TARGET-OS    GSE21257
OS status
  0                  56           30
  1                  29           23
Gender
  Male               48           34
  Female             37           19
Age (years)
  ≤15                46           21
  >15                39           32

2.3. Molecular Typing of the Genes Based on the Aging-Related Genes

The researchers first extracted the expression profiles of the 302 aging-related genes from the TARGET database. They then used the coxph function in R to carry out Univariate Cox regression analysis and identify the genes linked to prognosis in the OS patients. The TARGET-OS samples were then clustered using the ConsensusClusterPlus R package (distance = "euclidean", clusterAlg = "km"), and heat maps were drawn based on the prognostic genes. Survival curves for the resulting subtypes were drawn from the overall survival data.
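The core quantity behind consensus clustering is the consensus matrix: for each pair of samples, the fraction of resampling runs in which both were drawn and both landed in the same cluster. The Python sketch below illustrates that definition only. It is not the ConsensusClusterPlus implementation, and the function name and the toy runs are hypothetical.

```python
from itertools import combinations

def consensus_matrix(runs, n_samples):
    """Fraction of runs in which each sample pair was co-clustered.

    runs: list of dicts mapping sample index -> cluster label; each dict
    covers only the samples drawn in that resampling run.
    """
    together = [[0] * n_samples for _ in range(n_samples)]
    sampled = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    matrix = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        matrix[i][i] = 1.0
        for j in range(n_samples):
            if i != j and sampled[i][j] > 0:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix

# Hypothetical toy input: 4 samples, 3 resampling runs.
toy_runs = [
    {0: "a", 1: "a", 2: "b"},
    {0: "a", 1: "a", 3: "b"},
    {1: "a", 2: "a", 3: "b"},
]
toy_consensus = consensus_matrix(toy_runs, 4)
```

A clean clustering yields entries close to 0 or 1; intermediate values, such as the pair (1, 2) here, mark samples whose assignment is unstable across runs.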

To determine the relationship between the molecular subtypes and immune scores, the researchers used the ESTIMATE R package to compute three scores, i.e., ImmuneScore, StromalScore, and ESTIMATEScore. They used MCPcounter to score 10 cell populations. Then, they applied the ssGSEA technique implemented in the GSVA package to score 28 immune cell types [9]. Lastly, they compared the immune scores between the molecular subtypes.

2.4. Constructing a Prognostic Risk Model as per the Aging-Related Genes

The researchers retrieved the expression profiles of the aging-related genes that could affect OS prognosis from the TARGET database. The 85 TARGET samples were divided into training and validation sets in a 7 : 3 ratio. To make clinical detection practical, the researchers used Lasso regression and the Akaike Information Criterion (AIC) to reduce the number of genes included in the model. Lasso regression [10] yields a more refined model by adding a penalty function that shrinks the coefficients and sets some of them exactly to 0, which amounts to subset selection. It is a biased estimator suited to data with complex collinearity: it performs variable selection while estimating the parameters and thereby addresses the multicollinearity problem in regression analysis. The AIC was then used to settle the number of parameters.
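The shrink-and-zero behavior of the Lasso penalty can be illustrated with the soft-thresholding operator, which gives the Lasso solution exactly in the special case of an orthonormal design. The Python sketch below is a simplified illustration of the penalty, not the Cox-Lasso fit used in the paper; the coefficient values are hypothetical.

```python
def soft_threshold(coef, lam):
    """Shrink a coefficient toward zero by lam; return 0 if |coef| <= lam."""
    magnitude = max(abs(coef) - lam, 0.0)
    return magnitude if coef >= 0 else -magnitude

# Hypothetical unpenalized coefficients for four candidate genes.
unpenalized = [0.9, -0.4, 0.15, -0.05]

def count_zeroed(lam):
    """Number of coefficients set exactly to zero at penalty strength lam."""
    return sum(1 for c in unpenalized if soft_threshold(c, lam) == 0)

zeroed_by_lambda = {lam: count_zeroed(lam) for lam in (0.1, 0.2, 0.5, 1.0)}
```

As the penalty grows, more coefficients reach exactly zero, which is why the penalty strength controls how many genes remain in the model.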

The stepAIC function of the MASS package begins with the most complex model and removes variables one at a time to lower the AIC. The lower the AIC value, the better the model: a low AIC indicates a satisfactory fit achieved with fewer parameters. Finally, the researchers used survival analysis and ROC curves in the training set to assess the model performance.
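For reference, the criterion being minimized can be written as follows, where k is the number of estimated parameters and L-hat is the maximized likelihood of the fitted model (for a Cox model, the partial likelihood). The symbols are notation introduced here, not taken from the paper.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

Under this definition, removing one gene lowers k by one, so the removal reduces the AIC only if the log-likelihood drops by less than 1.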

2.5. Verification of the Risk Model

The researchers tested the risk model on several data sets as follows: (1) ROC curves were constructed using the pROC package to assess the performance of the prognosis model, and (2) survival curves were generated using the survminer package to assess the model's capacity to differentiate between high- and low-risk patients.
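The area under a ROC curve can be read as the probability that a patient who experienced the event received a higher score than a patient who did not. The Python sketch below computes the AUC through that pairwise definition. It assumes a binary outcome at a fixed time horizon and ignores censoring, so it is a simplification of what pROC or time-dependent ROC tools do; the scores and outcomes are hypothetical.

```python
def auc_from_scores(scores, events):
    """AUC as the share of (event, non-event) pairs ranked correctly.

    Ties between an event and a non-event count as half a correct pair.
    """
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores and outcomes (1 = died before the horizon).
toy_scores = [0.9, 0.7, 0.4, 0.3, 0.1]
toy_events = [1, 1, 0, 1, 0]
toy_auc = auc_from_scores(toy_scores, toy_events)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two outcome groups.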

2.6. Relationship between RiskScore and Pathway

The researchers used the R package GSVA to determine the correlation between the RiskScore values of the samples and their biological functions. For this purpose, they carried out ssGSEA to obtain the score of each biological function in every sample. They then calculated the correlation between each biological function and the RiskScore and selected the functions with an absolute correlation >0.35.
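The selection step can be sketched in Python as a correlation filter. The sketch assumes Pearson correlation and assumes the 0.35 cutoff applies to the absolute value, since negatively correlated pathways are reported in the results; neither detail is stated explicitly in the paper. Pathway names and scores are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_pathways(risk_scores, pathway_scores, threshold=0.35):
    """Keep pathways whose |correlation| with the risk score exceeds threshold."""
    selected = {}
    for name, scores in pathway_scores.items():
        r = pearson(risk_scores, scores)
        if abs(r) > threshold:
            selected[name] = r
    return selected

# Hypothetical per-sample risk scores and ssGSEA-style pathway scores.
toy_risk = [-1.2, -0.5, 0.1, 0.8, 1.5]
toy_pathways = {
    "PATHWAY_DOWN": [2.0, 1.6, 1.1, 0.7, 0.2],
    "PATHWAY_FLAT": [1.0, 0.0, 0.5, 0.0, 1.0],
}
toy_selected = select_pathways(toy_risk, toy_pathways)
```

The sign of each retained coefficient indicates whether the pathway score rises or falls as the risk score increases.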

2.7. RiskScores and the Clinical Features for Constructing the Forest Map

A forest map displays the statistical results of several study components simply and intuitively. In its standard form, drawn in a plane rectangular coordinate system, a vertical reference line perpendicular to the X-axis (the line of no effect, usually at x = 1 or 0) serves as the center. The effect size of each component and its 95% confidence interval are shown as line segments parallel to the X-axis. To establish whether the model is independent of other clinical factors, the RiskScore and the associated clinical features were assessed by Univariate and Multivariate Cox Regression analyses, and the results were displayed as a forest map.

2.8. Statistical Analyses and Testing of the Proposed Hypotheses

All statistical comparisons in this study, including the tests of whether groups differed significantly, were carried out in R 3.6.

3. Results

3.1. Molecular Typing Based on Aging-Related Genes

The researchers applied Univariate Cox analysis to the TARGET expression profiles of the 302 aging-related genes and identified 91 prognosis-related genes (Supplement Table 2). Consensus clustering was then carried out using the expression profiles of these prognosis-related genes. The CDF diagram indicates that the ideal number of clusters is two (Figure 2(b)), and the heat map of the consensus matrix shows that the samples cluster cleanly when k = 2 (Figure 2(a)). Heat maps were used to display the clustering results. Compared with Cluster 2, Cluster 1 showed significantly higher expression of most of the prognosis-related genes. Additionally, Figure 2(c) shows that the samples from patients who had died were concentrated in the Cluster 2 subtype. The researchers plotted the survival curves for the two molecular subtypes, which demonstrated that Cluster 1 patients had a significantly better prognosis than Cluster 2 patients (Figure 2(d), P < 0.001).
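The survival curves compared between clusters are Kaplan-Meier estimates: at each time where a death occurs, the running survival probability is multiplied by the fraction of at-risk patients who survive that time. A minimal Python sketch of the estimator follows; the function name and follow-up data are hypothetical, and this is not the code used in the paper.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up times (arbitrary units) and event indicators.
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients lower the number at risk without producing a step in the curve, which is how incomplete follow-up is accounted for.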

Table 2.

Sample grouping information.

Clinical features    TARGET-all    TARGET-test    TARGET-train    P value
Age (years)
  ≤15                46            12             34              0.813
  >15                39            12             27
Gender
  Female             37            11             26              0.979
  Male               48            13             35
OS status
  0                  56            17             39              0.726
  1                  29            7              22

Figure 2.

(a) Consensus matrix heat map for k = 2. (b) Cumulative distribution function (CDF) of the cluster consensus. (c) Cluster heat map of the prognosis-linked genes. (d) KM plots of overall survival for the two subgroups of patients from TARGET.

3.2. Comparative Analysis of the Immune Scores and the Matrix Scores between Both the Molecular Subtypes

The results obtained with the three R packages are displayed as violin plots. According to the ssGSEA results, the scores in Cluster 1 were significantly higher than those in Cluster 2 for central memory CD8 T cells, activated B cells, central memory CD4 T cells, regulatory T cells, Type 1 T helper cells, activated dendritic cells, macrophages, CD56 bright natural killer cells, and MDSC (Figure 3(a)). The ESTIMATE results indicated that the StromalScore and ESTIMATEScore were significantly higher in Cluster 1 than in Cluster 2 (Figure 3(b)). According to the MCPcounter results, Cluster 1 had significantly higher scores than Cluster 2 for T cells, monocytic lineage, B lineage, CD8 T cells, cytotoxic lymphocytes, neutrophils, myeloid dendritic cells, endothelial cells, and fibroblasts (Figure 3(c)). Figure 3(d) displays the heat maps for the three sets of immune scores.

Figure 3.

(a) Comparison of ssGSEA immune scores between different molecular subtypes. (b) Comparison of the calculated immune scores between different molecular subtypes. (c) Comparison of the MCPcounter immune scores noted between various molecular subtypes. (d) Heat map comparison of the immune scores using 3 immune software packages between different molecular subtypes.

3.3. Designing a Prognostic Risk Model That Was Based on the Aging-Related Genes

3.3.1. Randomly Grouping the Samples

The expression profiles of the prognosis-related aging genes in the TARGET dataset were retained. The 85 TARGET samples were divided into training and validation sets in a 7 : 3 ratio, and the clinically relevant indicators of the two sets were compared using the Chi-square test. The training and validation sets did not differ significantly in OS, age, or gender. The results are shown in Table 2.
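The balance check on the split can be sketched in Python using the counts in Table 2. The sketch assumes a 2 x 2 Chi-square test with Yates' continuity correction, which is the default behavior of R's chisq.test for 2 x 2 tables; the paper does not state whether the correction was applied. The function name is hypothetical.

```python
import math

def yates_chi_square_2x2(a, b, c, d):
    """Yates-corrected chi-square statistic and p-value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            # Continuity correction: shrink |O - E| by 0.5, never below zero.
            diff = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            stat += diff ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Counts from Table 2, entered as (test, train) for each category.
p_age = yates_chi_square_2x2(12, 34, 12, 27)[1]
p_gender = yates_chi_square_2x2(11, 26, 13, 35)[1]
p_os = yates_chi_square_2x2(17, 39, 7, 22)[1]
```

Under that assumption the three p-values come out close to those listed in Table 2 (0.813, 0.979, and 0.726), all far above conventional significance thresholds.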

3.3.2. Training Set Univariate Cox and Multivariate Cox Risk Analysis

Univariate Cox analysis was performed on the training set for each aging-related gene together with the survival data. Using the coxph function of the R package survival with p < 0.05 as the filtering criterion, 34 genes with a significant association were obtained (Supplement Table 3). Because a large number of genes makes clinical detection difficult, the set of aging-related genes had to be reduced while keeping a high level of accuracy. These 34 genes were therefore analyzed by Lasso Cox regression with the R package glmnet, together with the Akaike Information Criterion (AIC), to further reduce the number of genes in the risk model. Figure 4(a) displays the coefficient path of every independent variable: as lambda increases, the number of coefficients shrinking to 0 also increases. The model was built using tenfold cross-validation, and Figure 4(b) shows the confidence interval under each lambda. The model performs best when lambda = 0.1306079, and five genes were selected as target genes at this value: STAT5A, GPX4, ERCC4, EPS8, and TERT. Figure 5 displays the prognostic KM curves for the five genes, each of which significantly divided the TARGET training set samples into high- and low-risk groups (p < 0.05). Comparison of the expression levels between the groups showed that TERT is highly expressed in the high-risk group, whereas GPX4, ERCC4, EPS8, and STAT5A are expressed at low levels in the high-risk group (Figure 6).

Figure 4.

(a) Change track of every independent variable, where the X-axis denotes the log value of an independent variable (lambda), while the Y-axis denotes the coefficient of an independent variable. (b) The confidence interval included under every lambda.

Figure 5.

KM curves of 5 genes derived from the TARGET training set.

Figure 6.

Expression levels of the above-mentioned 5 genes that were categorized into the high- and low-risk groups.

3.3.3. Construction and Evaluation of the Risk Model

The following risk model scoring formula was used for the above-mentioned five genes:

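$$\mathrm{RiskScore} = \mathrm{TERT} \times 0.12 - \mathrm{GPX4} \times 0.03 - \mathrm{ERCC4} \times 0.26 - \mathrm{EPS8} \times 0.019 - \mathrm{STAT5A} \times 0.1 \tag{1}$$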

The researchers calculated the RiskScore of every sample from the expression of the 5 genes and plotted the RiskScore distribution of all samples (Figure 7(a)). The figure shows that the overall survival of samples with a high RiskScore is shorter than that of samples with a low RiskScore, implying that samples with a high RiskScore have a poor prognosis. The researchers further examined how the expression of the 5 genes changed with increasing risk values and noted that high TERT expression was associated with high risk, making TERT a risk factor. Additionally, ROC analysis of the RiskScore prognostic classification was carried out using the R package timeROC. Figure 7(b) illustrates the classification performance for 2-, 3-, and 5-year prognosis prediction; the risk model achieved a high Area Under the Curve (AUC). The RiskScore was then standardized as a z-score, and samples with z-score values > 0 were assigned to the high-risk group, while samples with z-score values < 0 were assigned to the low-risk group. Thus, 38 samples were placed in the low-risk group and 23 samples in the high-risk group. The KM curves (Figure 7(c)) indicated a significantly different prognosis between the two risk groups (p < 0.001).
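The scoring and grouping steps can be sketched in Python as follows. The coefficient magnitudes are those of Equation (1); the negative signs on the four genes other than TERT are an assumption based on the paper's description of those genes as protective factors. The expression values and sample names are hypothetical, and a z-score of exactly 0, which the paper does not address, is placed in the low-risk group here.

```python
from statistics import mean, pstdev

# Magnitudes from Equation (1); negative signs for the four non-TERT genes
# are an assumption (the text describes them as protective factors).
COEFFICIENTS = {
    "TERT": 0.12,
    "GPX4": -0.03,
    "ERCC4": -0.26,
    "EPS8": -0.019,
    "STAT5A": -0.1,
}

def risk_score(expression):
    """Weighted sum of the five gene expression values."""
    return sum(COEFFICIENTS[g] * expression[g] for g in COEFFICIENTS)

def assign_risk_groups(samples):
    """Standardize the scores and split at z = 0 (z > 0 is high risk)."""
    scores = {sid: risk_score(expr) for sid, expr in samples.items()}
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("all risk scores are identical")
    return {sid: ("high" if (s - mu) / sigma > 0 else "low")
            for sid, s in scores.items()}

# Hypothetical expression values for three samples.
toy_samples = {
    "S1": {"TERT": 5, "GPX4": 2, "ERCC4": 1, "EPS8": 1, "STAT5A": 1},
    "S2": {"TERT": 1, "GPX4": 6, "ERCC4": 4, "EPS8": 5, "STAT5A": 3},
    "S3": {"TERT": 2, "GPX4": 4, "ERCC4": 3, "EPS8": 2, "STAT5A": 2},
}
toy_groups = assign_risk_groups(toy_samples)
```

Because the cutoff is z = 0, the split is equivalent to comparing each score with the cohort mean, so group sizes need not be equal.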

Figure 7.

(a) RiskScore, survival status, survival time, and the expression of 5 genes retrieved from the TARGET training set. (b) ROC curve and the AUC of the novel 5-gene signature. (c) Distribution of the KM survival curves for the 5-gene signature included in the TARGET training set.

3.4. Verification of the Risk Model

3.4.1. Internal Data Sets to Verify the Robustness of This 5-Gene Signature

To assess the reliability of the risk model, the researchers applied the same model and coefficients as in the training set to the TARGET validation set and to the complete TARGET data set. They calculated the RiskScore of each sample and plotted the RiskScore distribution. Figure 8(a) depicts the distribution of RiskScores for the TARGET validation set. The figure shows that the overall survival of samples with a high RiskScore is shorter than that of samples with a low RiskScore, indicating that samples with a high RiskScore have a worse prognosis. The researchers further examined how the expression of the 5 genes changed with increasing risk values and noted that high TERT expression was associated with high risk, making TERT a risk factor. By contrast, higher expression of ERCC4, GPX4, EPS8, and STAT5A was associated with low risk, making these genes protective factors. These results were similar to those in the TARGET training set. Additionally, ROC analysis of the RiskScore prognostic classification was carried out using the R package timeROC. Figure 8(b) illustrates the classification performance for 2-, 3-, and 5-year prognosis prediction. The RiskScore was then standardized as a z-score, and samples with z-score values > 0 were assigned to the high-risk group, while samples with z-score values < 0 were assigned to the low-risk group. Thus, 15 samples were placed in the low-risk group and 9 samples in the high-risk group. The KM curves (Figure 8(c)) indicated a significantly different prognosis between the two risk groups (p < 0.001).

Figure 8.

(a) RiskScore, survival status, survival time, and the expression of 5 genes retrieved from the TARGET test set. (b) ROC curve and AUC of the novel 5-gene signature. (c) Distribution of KM survival curves for the 5-gene signature included in the TARGET test set.

Figure 9(a) presents the RiskScore distribution of all samples in the TARGET dataset. The figure shows that the overall survival of samples with a high RiskScore is shorter than that of samples with a low RiskScore, indicating that samples with a high RiskScore have a worse prognosis. The researchers further examined how the expression of the 5 genes changed with increasing risk values and noted that high TERT expression was associated with high risk, making TERT a risk factor. By contrast, higher expression of ERCC4, GPX4, EPS8, and STAT5A was associated with low risk, making these genes protective factors. These results were similar to those in the TARGET training set. Additionally, ROC analysis of the RiskScore prognostic classification was carried out using the R package timeROC. Figure 9(b) illustrates the classification performance for 2-, 3-, and 5-year prognosis prediction; the risk model achieved a high AUC. The RiskScore was then standardized as a z-score, and samples with z-score values > 0 were assigned to the high-risk group, while samples with z-score values < 0 were assigned to the low-risk group. Thus, 51 samples were placed in the low-risk group and 34 samples in the high-risk group. The KM curves (Figure 9(c)) indicated a significantly different prognosis between the two risk groups (p < 0.001).

Figure 9.

(a) RiskScore, survival status, survival time, and expression of 5 genes retrieved from all TARGET datasets. (b) ROC curve and AUC of the novel 5-gene signature. (c) Distribution of KM survival curves for the 5-gene signature included in the TARGET datasets.

3.4.2. External Data Sets to Verify the Reliability and Robust Nature of the 5-Gene Signature

For the external independent verification dataset, GSE21257, the researchers applied the same risk model and coefficients as in the training set to calculate the RiskScore of every sample. The RiskScore distribution of these samples is plotted in Figure 10(a). The figure shows that the overall survival of samples with a high RiskScore is shorter than that of samples with a low RiskScore, indicating that samples with a high RiskScore have a worse prognosis. The researchers further examined how the expression of the 5 genes changed with increasing risk values and noted that high TERT expression was associated with high risk, making TERT a risk factor. By contrast, higher expression of ERCC4, GPX4, EPS8, and STAT5A was associated with low risk, making these genes protective factors. These results were similar to those in the TARGET training set. Additionally, ROC analysis of the RiskScore prognostic classification was carried out using the R package timeROC. Figure 10(b) illustrates the classification performance for 2-, 3-, and 5-year prognosis prediction; the risk model achieved a high AUC. The RiskScore was then standardized as a z-score, and samples with z-score values > 0 were assigned to the high-risk group, while samples with z-score values < 0 were assigned to the low-risk group. Thus, 25 samples were placed in the low-risk group and 28 samples in the high-risk group. The KM curves (Figure 10(c)) indicated a significantly different prognosis between the two risk groups (p < 0.001).

Figure 10.

(a) RiskScore, survival status, survival time, and expression of 5 genes retrieved from the independent validation data set GSE21257. (b) ROC curve and AUC of the novel 5-gene signature. (c) Distribution of the KM survival curves for the 5-gene signature included in the independent validation data set GSE21257.

3.5. Risk Model and Prognosis Analysis of the Clinical Characteristics Displayed by the Samples

The researchers carried out survival analysis of the two risk groups, defined by their RiskScore values, within subsets of samples grouped by clinical features. The results showed that the novel 5-gene signature significantly separated the high-risk and low-risk groups within each age subgroup and each gender subgroup (Figure 11, p < 0.01). The risk model therefore retained good predictive ability across samples with different clinical characteristics.
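Differences between two KM curves of this kind are commonly assessed with the log-rank test, which compares the observed number of deaths in one group with the number expected if both groups shared the same hazard. The paper does not name the test behind the reported p-values, so the Python sketch below is an illustration under that assumption; the function name and data are hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic and p-value (1 df)."""
    records = [(t, e, 0) for t, e in zip(times_a, events_a)]
    records += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in records if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        n_a = sum(1 for u, _, g in records if u >= t and g == 0)
        n_b = sum(1 for u, _, g in records if u >= t and g == 1)
        d_a = sum(1 for u, e, g in records if u == t and e == 1 and g == 0)
        d_b = sum(1 for u, e, g in records if u == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no informative event times")
    stat = (observed_a - expected_a) ** 2 / variance
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Hypothetical groups: A dies early, B dies late, no censoring.
toy_stat, toy_p = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

Running the test separately within each age or gender subgroup corresponds to the stratified comparison described above.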

Figure 11.

(a) KM curves for the high- and low-risk groups among patients more than 15 years of age. (b) KM curves for the high- and low-risk groups among patients aged 15 years or younger. (c) KM curves for the high- and low-risk groups among female patients. (d) KM curves for the high- and low-risk groups among male patients.

3.6. Relationship between the RiskScores and Pathway

While analyzing the correlation between the RiskScore values and the biological functions, the researchers noted that 24 KEGG pathways showed a negative correlation with the RiskScore of the samples, whereas 1 KEGG pathway was positively correlated with the RiskScore (Figure 12(a)). They then carried out a cluster analysis of these KEGG pathways based on their enrichment scores. Figure 12(b) presents the results of this analysis: among the 25 pathways, KEGG_JAK_STAT_SIGNALING_PATHWAY, KEGG_NATURAL_KILLER_CELL_MEDIATED_CYTOTOXICITY, KEGG_TOLL_LIKE_RECEPTOR_SIGNALING_PATHWAY, KEGG_CYTOKINE_CYTOKINE_RECEPTOR_INTERACTION, and a few other pathways were suppressed as the RiskScore increased.

Figure 12.

(a) Clustering of the correlation coefficients between the RiskScore and the KEGG pathways whose absolute correlation with the RiskScore was >0.35. (b) Variations in the ssGSEA scores of the KEGG pathways having an absolute correlation >0.35 in every sample with increasing RiskScore. The X-axis denotes the samples, with the RiskScore increasing from left to right.

3.7. Relationship between RiskScore Values and the Clinical Characteristics of the Patients

To determine the robustness of the 5-gene signature model in clinical applications, the researchers used the complete clinical data in the TARGET dataset for Univariate and Multivariate Cox Regression analyses and displayed the results as forest maps. For the RiskScore, Univariate analysis gave HR = 5.45, 95% CI = 2.86–10.41, p < 0.001 (Figure 13(a)), and Multivariate analysis gave HR = 5.24, 95% CI = 2.74–10.03, p < 0.001 (Figure 13(b)). These results indicate that the 5-gene signature has good predictive performance in clinical applications.
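The reported hazard ratios and intervals can be checked for internal consistency. Assuming the intervals are Wald intervals (symmetric around the log hazard ratio, which is what standard Cox software reports but is not stated in the paper), the standard error and the p-value can be recovered from the published numbers. The function name below is hypothetical.

```python
import math

def wald_summary(hr, ci_low, ci_high, z_crit=1.96):
    """Recover log-HR, its standard error, z, and p from an HR and its 95% CI."""
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_hr / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    center = math.sqrt(ci_low * ci_high)  # geometric midpoint of the interval
    return {"log_hr": log_hr, "se": se, "z": z, "p": p_value, "center": center}

# Values reported for the RiskScore in the univariate and multivariate analyses.
univariate = wald_summary(5.45, 2.86, 10.41)
multivariate = wald_summary(5.24, 2.74, 10.03)
```

In both analyses the geometric midpoint of the interval falls very close to the reported hazard ratio, and the implied p-values are well below 0.001, so the published figures are mutually consistent under the Wald assumption.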

Figure 13.

(a) Univariate analysis based on the newly constructed 5-gene signature, age, and gender of the patients. (b) Multivariate analysis based on the novel 5-gene signature, age, and gender of the patients included in the study.

3.8. Comparison between the Risk Model and Other Models

After reviewing the literature, the researchers selected 2 prognostic risk models, a 3-gene signature [11] and a 7-gene signature [12], for comparison with the newly constructed 5-gene signature. To ensure a fair comparison, they calculated the RiskScore of every OS sample in the TARGET database by the same method, using the corresponding genes of each model. The RiskScore was then standardized as a z-score, and samples with z-score values > 0 were assigned to the high-risk group, while samples with z-score values < 0 were assigned to the low-risk group. The researchers then estimated the difference in prognosis between the two groups. Figure 14 and Figure 15 present the ROC and KM curves for the 3-gene and 7-gene models, respectively. Both models showed lower AUC values at 2, 3, and 5 years than our model, which uses a reasonable number of genes and performs better. The 2 models could nevertheless also differentiate effectively between the samples of the two risk groups (p < 0.001).

Figure 14.

(a) RiskScore, survival status, survival time, and expression of 3 genes retrieved from all the TARGET datasets. (b) ROC curve and AUC of the 3-gene signature. (c) Distribution of KM survival curves for the 3-gene signature included in all the TARGET datasets.

Figure 15.

(a) RiskScore, survival status, survival time, and expression of 7 genes retrieved from all the TARGET datasets. (b) ROC curve and AUC of the 7-gene signature. (c) Distribution of KM survival curves for the 7-gene signature included in all the TARGET datasets.

4. Discussion

Cell senescence is the irreversible arrest of cell division. Typically, under stress or with time, the cell cycle and DNA replication slow down, and normal physiological functions and proliferative capacity deteriorate; these changes are accompanied by morphological and functional changes, abnormalities in metabolism, and a deterioration of the immune system [13]. At the molecular level, aging is characterized by the accumulation of gene mutations, epigenetic alterations, aberrant mitochondrial function, decreased expression of cell cycle regulators, higher expression of cell cycle inhibitors and aging-related genes, decreased efficiency of DNA, RNA, and protein synthesis, and suppressed expression of genes involved in DNA damage repair [14]. Tumorigenesis and aging are closely related, as they promote and influence one another. Tumorigenesis is to some extent a natural outcome of aging, and aging is a significant risk factor for tumorigenesis [15].

Currently, the free radical theory [16] and the telomere theory are the two most widely accepted theories of aging. Telomeres are specialized structures at the ends of eukaryotic chromosomes, and telomerase can synthesize telomeric DNA using its internal RNA as a template, preserving telomere length and allowing unrestricted cell division [17]. Recent research has shown that only germ cells and hematopoietic stem cells display telomerase activity. On the other hand, 85–90% of tumor cells have telomerase activity, indicating that telomerase and tumor development are closely linked. Osteosarcoma is a prevalent and malignant bone tumor that occurs in adolescents and young children, and its prevalence has increased in the past few years. Its prognosis is poor, its degree of malignancy is high, and it progresses rapidly. Earlier studies have noted that telomerase is overexpressed in malignant rather than benign bone tumors, indicating a bad prognosis [19]. Following telomere inhibitor therapy, telomere length in osteosarcoma decreased drastically and telomerase activity was reduced [20]. The incidence and development of osteosarcoma are thus tightly linked to genes associated with aging.

In this research, we first classified osteosarcoma into two subgroups based on genes associated with aging and prognosis. Cluster 1 showed a significantly better prognosis than Cluster 2. With age, immune cells become less responsive to antigen stimulation and the body's immune defenses become less effective, which can promote the onset and development of tumors. We therefore examined immune cell infiltration in the two subtypes. The results revealed that the immune microenvironment differed significantly between the two subtypes, which may account for the difference in survival between them. Next, a risk model based on genes associated with aging and prognosis was built, yielding a 5-gene signature containing TERT, GPX4, ERCC4, EPS8, and STAT5A. Telomerase reverse transcriptase (TERT) is the catalytic subunit of telomerase. It is crucial for maintaining telomere length, which allows eukaryotic cells to proliferate indefinitely [21]. Studies have linked TERT promoter mutations to higher mRNA expression and telomerase activity in a range of tumors [22]. For instance, TERT promoter mutations increase the expression of the TERT gene, and TERT gene polymorphisms are linked to prostate cancer invasion and a poor prognosis [23, 24]. Elevated TERT expression has been linked to numerous malignant cancers, including thyroid carcinoma, head and neck cancer, cervical cancer, and urothelial carcinoma [25–28]. Cisplatin-resistant osteosarcoma cells exhibit high levels of TERT expression. During cisplatin treatment of osteosarcoma cells, TERT is translocated from the nucleus to the mitochondria [29].
The prognosis of osteosarcoma patients can be predicted using a 6-gene signature (including TERT) constructed from apoptosis-linked genes [30]. Glutathione peroxidase 4 (GPX4) is commonly considered a marker of ferroptosis and is crucial for maintaining oxidative homeostasis. The GPX4 protein removes lipid peroxides. When GPX4 is inactivated, lipid peroxides upset the redox balance, disrupt membrane integrity, and induce ferroptosis [31]. GPX4 acts as an oncogene and is highly expressed in many malignant tumors [32–34]. Because the GPX4 promoter region shows low methylation in tumor samples, low methylation and high histone acetylation may lead to GPX4 overexpression in tumor cells. Reduced GPX4 protein levels in osteosarcoma result in ferroptosis and enhanced cisplatin sensitivity [36]. ERCC4 is a crucial molecule in nucleotide excision repair (NER). The rs2276466 polymorphism of the ERCC4 gene has been linked to breast cancer [37]. ERCC4 is also linked to the onset or progression of bladder cancer [38], gastric cancer [39], oral cancer [40], and colorectal cancer [41]. Earlier studies have shown that the ERCC4 gene is differentially expressed in osteosarcoma and normal tissues [42]. The level of ERCC4 mRNA in peripheral blood cells also correlates with the response of osteosarcoma to chemotherapy [43]. Epidermal growth factor receptor pathway substrate 8 (EPS8) is one of the key kinase substrates of the epidermal growth factor receptor (EGFR). EPS8 is a signaling molecule that regulates many signaling pathways and biological activities in cells. It also helps maintain cell proliferation, differentiation, and survival.
EPS8 is overexpressed in many cancers, including pancreatic cancer, colorectal cancer, oral squamous cell carcinoma, esophageal squamous cell carcinoma, adenocarcinoma, and cervical squamous cell carcinoma, and it is closely related to tumor occurrence, progression, invasion, and sensitivity to chemotherapy [44–47]. Signal transducer and activator of transcription 5A (STAT5A) is one of the STAT5 subtypes. Nasopharyngeal carcinoma, breast cancer, prostate cancer, leukemia, and other malignancies can arise from abnormal STAT5 activation and overexpression [48, 49]. Since overexpression of activated STAT5A promotes cell cycle progression whereas STAT5A inactivation inhibits it, researchers concluded that STAT5A could be crucial in the development of tumors [50]. Osteosarcoma patients with low STAT5A expression show a poor prognosis [51]. To date, these five genes have not been thoroughly studied in osteosarcoma. Future investigation is necessary to confirm the mechanisms by which the five genes contribute to the onset and progression of osteosarcoma. We used the newly constructed risk model to estimate a risk score for every sample and then categorized the samples into high- and low-risk groups according to their risk scores. Patients in the high-risk group showed a poorer prognosis than those in the low-risk group. We also assessed the robustness and reliability of the risk model using internal and external validation sets. The results indicated that the risk model performed well. Compared with previously published osteosarcoma risk models, the 5-gene signature developed in this study showed better risk prediction.
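
The scoring and grouping step described above can be illustrated with a minimal sketch. It assumes the usual form of a Cox/Lasso signature, in which a sample's risk score is the sum of each gene's expression weighted by its regression coefficient, and it assumes a median cutoff for the two groups, since the cutoff is not stated in this section. The coefficient values are hypothetical placeholders and not the fitted values from this study, and the function names `risk_score` and `stratify` are illustrative only.

```python
from statistics import median

# HYPOTHETICAL coefficients for illustration only; these are NOT the
# fitted values of the 5-gene signature reported in this study.
COEFFICIENTS = {
    "ERCC4": 0.5,
    "GPX4": 0.25,
    "EPS8": 0.5,
    "TERT": 1.0,
    "STAT5A": -0.5,
}


def risk_score(expression):
    """Return the linear predictor: sum of coefficient * expression per gene."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(samples):
    """Label each sample 'high' or 'low' risk.

    Assumption: the cutoff is the cohort median of the risk scores;
    samples strictly above the median are labeled 'high'.
    """
    scores = {sid: risk_score(expr) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}


if __name__ == "__main__":
    # Toy expression values (made up) for three samples.
    toy = {
        "s1": {g: 1.0 for g in COEFFICIENTS},
        "s2": {g: 2.0 for g in COEFFICIENTS},
        "s3": {**{g: 1.0 for g in COEFFICIENTS}, "TERT": 0.0},
    }
    print(stratify(toy))
```

In practice the coefficients would come from the fitted Lasso-Cox model and the expression values from the normalized TARGET or GEO profiles. A different cutoff, such as zero after z-score standardization of the scores, would change only the `cutoff` line.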

Nevertheless, this research has some limitations. First, we used only osteosarcoma samples and could not strictly control intergroup conditions, which may introduce bias, and the model was not verified with real-world clinical data. Second, protein levels and specific biological functions were not verified. Third, the role of the screened signaling pathways in osteosarcoma remains unclear. Fourth, the sample size was small, so more research is needed to confirm the stability of the risk model. Further molecular biology studies and trials are warranted.

5. Conclusions

The 5-gene risk model that was constructed using the aging-linked genes could precisely predict the prognosis of osteosarcoma patients and assist in making proper clinical decisions.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Authors' Contributions

D. C. L. designed the study and performed data analysis, and S. Y. D. wrote the manuscript. Q. B. N. and Z. Y. performed data collection, and L. T. supervised the manuscript. All named authors have read and approved the manuscript.

Supplementary Materials

Supplement Table 1: TARGET-OS dataset including 85 samples, comprising 302 genes. Supplement Table 2: 91 prognosis-related genes using TARGET expression profile data. Supplement Table 3: 34 aging-related genes with a significant difference.

References

  • 1. Smrke A., Anderson P. M., Gulia A., Gennatas S., Huang P. H., Jones R. L. Future directions in the treatment of osteosarcoma. Cells. 2021;10(1):p. 172. doi: 10.3390/cells10010172.
  • 2. Bozorgi A., Sabouri L. Osteosarcoma, personalized medicine, and tissue engineering; an overview of overlapping fields of research. Cancer Treatment and Research Communications. 2021;27:100324. doi: 10.1016/j.ctarc.2021.100324.
  • 3. Hosseinzadeh P., DeVries C. A., Nielsen E., et al. Changes in the practice of pediatric orthopaedic surgeons over the past decade: analysis of the database of the American board of orthopaedic surgery. Journal of Pediatric Orthopaedics. 2018;38(8):e486–e489. doi: 10.1097/bpo.0000000000001214.
  • 4. Faisham W. I., Mat Saad A. Z., Alsaigh L. N., et al. Prognostic factors and survival rate of osteosarcoma: a single-institution study. Asia-Pacific Journal of Clinical Oncology. 2017;13(2):e104–e110. doi: 10.1111/ajco.12346.
  • 5. Saraf A. J., Fenger J. M., Roberts R. D. Osteosarcoma: accelerating progress makes for a hopeful future. Frontiers in Oncology. 2018;8:p. 4. doi: 10.3389/fonc.2018.00004.
  • 6. You J., Dong R., Ying M., He Q., Cao J., Yang B. Cellular senescence and anti-cancer therapy. Current Drug Targets. 2019;20(7):705–715. doi: 10.2174/1389450120666181217100833.
  • 7. Liu X., Mo W., Ye J., et al. Regulatory T cells trigger effector T cell DNA damage and senescence caused by metabolic competition. Nature Communications. 2018;9(1):p. 249. doi: 10.1038/s41467-017-02689-5.
  • 8. Laberge R. M., Awad P., Campisi J., Desprez P. Y. Epithelial-mesenchymal transition induced by senescent fibroblasts. Cancer Microenvironment. 2012;5(1):39–44. doi: 10.1007/s12307-011-0069-4.
  • 9. Charoentong P., Finotello F., Angelova M., et al. Pan-cancer immunogenomic analyses reveal genotype-immunophenotype relationships and predictors of response to checkpoint blockade. Cell Reports. 2017;18(1):248–262. doi: 10.1016/j.celrep.2016.12.019.
  • 10. Tibshirani R. S. Regression shrinkage and selection via the lasso: a retrospective. Journal of the Royal Statistical Society. Series B. 1996;58.
  • 11. Shi Y., He R., Zhuang Z., et al. A risk signature-based on metastasis-associated genes to predict survival of patients with osteosarcoma. Journal of Cellular Biochemistry. 2020;121(7):3479–3490. doi: 10.1002/jcb.29622.
  • 12. Yu Y., Zhang H., Ren T., et al. Development of a prognostic gene signature based on an immunogenomic infiltration analysis of osteosarcoma. Journal of Cellular and Molecular Medicine. 2020;24(19):11230–11242. doi: 10.1111/jcmm.15687.
  • 13. Di Micco R., Krizhanovsky V., Baker D., d’Adda di Fagagna F. Cellular senescence in ageing: from mechanisms to therapeutic opportunities. Nature Reviews Molecular Cell Biology. 2021;22(2):75–95. doi: 10.1038/s41580-020-00314-w.
  • 14. Hernandez-Segura A., Nehme J., Demaria M. Hallmarks of cellular senescence. Trends in Cell Biology. 2018;28(6):436–453. doi: 10.1016/j.tcb.2018.02.001.
  • 15. Poropatich K., Fontanarosa J., Samant S., Sosman J. A., Zhang B. Cancer immunotherapies: are they as effective in the elderly? Drugs Aging. 2017;34(8):567–581. doi: 10.1007/s40266-017-0479-1.
  • 16. Harman D. Aging: a theory based on free radical and radiation chemistry. Journal of Gerontology. 1956;11(3):298–300. doi: 10.1093/geronj/11.3.298.
  • 17. Zainabadi K. A brief history of modern aging research. Experimental Gerontology. 2018;104:35–42. doi: 10.1016/j.exger.2018.01.018.
  • 18. Mo Y., Gan Y., Song S., et al. Simultaneous targeting of telomeres and telomerase as a cancer therapeutic approach. Cancer Research. 2003;63(3):579–585.
  • 19. Sanders R. P., Drissi R., Billups C. A., Daw N. C., Valentine M. B., Dome J. S. Telomerase expression predicts unfavorable outcome in osteosarcoma. Journal of Clinical Oncology. 2004;22(18):3790–3797. doi: 10.1200/jco.2004.03.043.
  • 20. Thompson P. A., Drissi R., Muscal J. A., et al. A phase I trial of imetelstat in children with refractory or recurrent solid tumors: a Children’s Oncology Group Phase I Consortium Study (ADVL1112). Clinical Cancer Research. 2013;19(23):6578–6584. doi: 10.1158/1078-0432.ccr-13-1117.
  • 21. Zhang F., Cheng D., Wang S., Zhu J. Human specific regulation of the telomerase reverse transcriptase gene. Genes. 2016;7(7):p. 30. doi: 10.3390/genes7070030.
  • 22. Panebianco F., Nikitski A. V., Nikiforova M. N., Nikiforov Y. E. Spectrum of TERT promoter mutations and mechanisms of activation in thyroid cancer. Cancer Medicine. 2019;8(13):5831–5839. doi: 10.1002/cam4.2467.
  • 23. Kumari A., Srinivasan R., Vasishta R. K., Wig J. D. Positive regulation of human telomerase reverse transcriptase gene expression and telomerase activity by DNA methylation in pancreatic cancer. Annals of Surgical Oncology. 2009;16(4):1051–1059. doi: 10.1245/s10434-009-0333-8.
  • 24. Stoehr R., Taubert H., Zinnall U., et al. Frequency of TERT promoter mutations in prostate cancer. Pathobiology. 2015;82(2):53–57. doi: 10.1159/000381903.
  • 25. Li C., Wu S., Wang H., et al. The C228T mutation of TERT promoter frequently occurs in bladder cancer stem cells and contributes to tumorigenesis of bladder cancer. Oncotarget. 2015;6(23):19542–19551. doi: 10.18632/oncotarget.4295.
  • 26. Hosen M. I., Forey N., Durand G., et al. Development of sensitive droplet digital PCR assays for detecting urinary TERT promoter mutations as non-invasive biomarkers for detection of urothelial cancer. Cancers. 2020;12(12):3541. doi: 10.3390/cancers12123541.
  • 27. Wang J., Zhu X., Ying P., Zhu Y. PIF1 affects the proliferation and apoptosis of cervical cancer cells by influencing TERT. Cancer Management and Research. 2020;12:7827–7835. doi: 10.2147/cmar.s265336.
  • 28. Arantes L. M. R. B., Cruvinel-Carloni A., de Carvalho A. C., et al. TERT promoter mutation C228T increases risk for tumor recurrence and death in head and neck cancer patients. Frontiers in Oncology. 2020;10:p. 1275. doi: 10.3389/fonc.2020.01275.
  • 29. Zhang Z., Yu L., Dai G., et al. Telomerase reverse transcriptase promotes chemoresistance by suppressing cisplatin-dependent apoptosis in osteosarcoma cells. Scientific Reports. 2017;7(1):p. 7070. doi: 10.1038/s41598-017-07204-w.
  • 30. Yang F., Zhang Y. Apoptosis-related genes-based prognostic signature for osteosarcoma. Aging (Albany NY). 2022;14(9):3813–3825. doi: 10.18632/aging.204042.
  • 31. Yang W. S., SriRamaratnam R., Welsch M., et al. Regulation of ferroptotic cancer cell death by GPX4. Cell. 2014;156(1-2):317–331. doi: 10.1016/j.cell.2013.12.010.
  • 32. Kinowaki Y., Kurata M., Ishibashi S., et al. Glutathione peroxidase 4 overexpression inhibits ROS-induced cell death in diffuse large B-cell lymphoma. Laboratory Investigation. 2018;98(5):609–619. doi: 10.1038/s41374-017-0008-1.
  • 33. Guerriero E., Capone F., Accardo M., et al. GPX4 and GPX7 over-expression in human hepatocellular carcinoma tissues. European Journal of Histochemistry. 2015;59(4):p. 2540. doi: 10.4081/ejh.2015.2540.
  • 34. Zhao H., Ji B., Chen J., Huang Q., Lu X. Gpx 4 is involved in the proliferation, migration and apoptosis of glioma cells. Pathology, Research & Practice. 2017;213(6):626–633. doi: 10.1016/j.prp.2017.04.025.
  • 35. Zhang X., Sui S., Wang L., et al. Inhibition of tumor propellant glutathione peroxidase 4 induces ferroptosis in cancer cells and enhances anticancer effect of cisplatin. Journal of Cellular Physiology. 2020;235(4):3425–3437. doi: 10.1002/jcp.29232.
  • 36. Liu Q., Wang K. The induction of ferroptosis by impairing STAT3/Nrf2/GPx4 signaling enhances the sensitivity of osteosarcoma cells to cisplatin. Cell Biology International. 2019;43(11):1245–1256. doi: 10.1002/cbin.11121.
  • 37. Sahaba S. A., Rashid M. A., Islam M. S., et al. The link of ERCC2 rs13181 and ERCC4 rs2276466 polymorphisms with breast cancer in the Bangladeshi population. Molecular Biology Reports. 2022;49(3):1847–1856. doi: 10.1007/s11033-021-06994-7.
  • 38. Qiu J., Wang X., Meng X., et al. Attenuated NER expressions of XPF and XPC associated with smoking are involved in the recurrence of bladder cancer. PLoS One. 2014;9(12):e115224. doi: 10.1371/journal.pone.0115224.
  • 39. Li P., Ma Y. Correlation of xeroderma pigmentosum complementation group F expression with gastric cancer and prognosis. Oncology Letters. 2018;16(6):6971–6976. doi: 10.3892/ol.2018.9529.
  • 40. Sa M. C., Conceicao T. S., de Moura Santos E., de Morais E. F., Galvao H. C., de Almeida Freitas R. Immunohistochemical expression of TFIIH and XPF in oral tongue squamous cell carcinoma. European Archives of Oto-Rhino-Laryngology. 2020;277(3):893–902. doi: 10.1007/s00405-019-05757-2.
  • 41. Hu H., Liu S., Chu A., Chen J., Xing C., Jing J. Comprehensive analysis of ceRNA network of ERCC4 in colorectal cancer. PeerJ. 2021;9:e12647. doi: 10.7717/peerj.12647.
  • 42. Rothzerg E., Xu J., Wood D., Koks S. 12 Survival-related differentially expressed genes based on the TARGET-osteosarcoma database. Experimental Biology and Medicine (Maywood, NJ, United States). 2021;246(19):2072–2081. doi: 10.1177/15353702211007410.
  • 43. Li X., Guo W., Shen D. H., Yang R. L., Liu J., Zhao H. [Expressions of ERCC2 and ERCC4 genes in osteosarcoma and peripheral blood lymphocytes and their clinical significance]. Beijing Da Xue Xue Bao Yi Xue Ban. 2007;39(5):467–471.
  • 44. Welsch T., Endlich K., Giese T., Buchler M. W., Schmidt J. Eps8 is increased in pancreatic cancer and required for dynamic actin-based cell protrusions and intercellular cytoskeletal organization. Cancer Letters. 2007;255(2):205–218. doi: 10.1016/j.canlet.2007.04.008.
  • 45. Yap L. F., Jenei V., Robinson C. M., et al. Upregulation of Eps8 in oral squamous cell carcinoma promotes cell migration and invasion through integrin-dependent Rac1 activation. Oncogene. 2009;28(27):2524–2534. doi: 10.1038/onc.2009.105.
  • 46. Chen Y. J., Shen M. R., Chen Y. J., Maa M. C., Leu T. H. Eps8 decreases chemosensitivity and affects survival of cervical cancer patients. Molecular Cancer Therapeutics. 2008;7(6):1376–1385. doi: 10.1158/1535-7163.mct-07-2388.
  • 47. Maa M. C., Lee J. C., Chen Y. J., et al. Eps8 facilitates cellular growth and motility of colon cancer cells by increasing the expression and activity of focal adhesion kinase. Journal of Biological Chemistry. 2007;282(27):19399–19409. doi: 10.1074/jbc.m610280200.
  • 48. Surbek M., Tse W., Moriggl R., Han X. A centric view of JAK/STAT5 in intestinal homeostasis, infection, and inflammation. Cytokine. 2021;139:155392. doi: 10.1016/j.cyto.2020.155392.
  • 49. Maranto C., Udhane V., Jia J., et al. Prospects for clinical development of Stat5 inhibitor IST5-002: high transcriptomic specificity in prostate cancer and low toxicity in vivo. Cancers. 2020;12(11):p. 3412. doi: 10.3390/cancers12113412.
  • 50. Li Z., Chen C., Chen L., et al. STAT5a confers doxorubicin resistance to breast cancer by regulating ABCB1. Frontiers in Oncology. 2021;11:697950. doi: 10.3389/fonc.2021.697950.
  • 51. Guo Z., Tang Y., Fu Y., Wang J. Decreased expression of STAT5A predicts poor prognosis in osteosarcoma. Pathology, Research & Practice. 2019;215(3):519–524. doi: 10.1016/j.prp.2019.01.008.

Articles from Journal of Oncology are provided here courtesy of Wiley