A novel serum miRNA-pair classifier for diagnosis of sarcoma

Zheng Jin; Shanshan Liu; Pei Zhu; Mengyan Tang; Yuanxin Wang; Yuan Tian; Dong Li; Xun Zhu; Dongmei Yan; Zhenhua Zhu

doi:10.1371/journal.pone.0236097

PLoS One. 2020 Jul 16;15(7):e0236097.

Editor: David M Loeb

PMCID: PMC7365454 PMID: 32673360

Abstract

Soft tissue sarcomas (STS) are a group of rare malignant tumors that originate from the mesoderm. Early diagnosis is important for the prognosis of sarcoma, but no mature, non-invasive diagnostic method currently exists. MicroRNAs (miRNAs) are a class of noncoding RNAs whose expression varies greatly, especially during tumor activity. The purpose of this study was to construct a predictive model for the diagnosis of sarcomas based on the relative expression levels of miRNAs in serum. miRNA array expression data from 677 samples, including 402 malignant sarcoma samples and 275 healthy samples, were used to construct the prediction model. A random generalized linear model (RGLM) was constructed from 6 gene pairs. In predicting whether a serum sample was obtained from a sarcoma patient, the model reached an accuracy of 100% in the internal test dataset and 74.3% in the merged external dataset, with a specificity of 100% in the internal test dataset and 90.5% in the external dataset. In conclusion, our serum miRNA-pair classifier has the potential to be used for sarcoma screening with high accuracy and specificity.

Introduction

In general, sarcomas are divided into bone and soft tissue sarcomas, both of which have many subtypes [1]. Advances in adjuvant chemotherapy and surgical techniques have provided additional options for the treatment of sarcomas. However, high-grade sarcomas are prone to recurrence and metastasis, and once they have metastasized, the death rate can increase to 50% [2]. CT and MRI imaging and biopsy are routine diagnostic methods for sarcomas, but the high cost and invasive nature of these approaches make them poorly suited to sarcoma screening. The development of a simple and convenient screening method is of critical importance for the treatment and prognosis of sarcomas.

MicroRNAs (miRNAs) are short (∼22 nucleotides) non-coding RNA molecules that regulate gene expression at the post-transcriptional level and have critical functions across various biological processes [3]. Furthermore, miRNAs showed higher accuracy than messenger RNA in classifying poorly differentiated tumors in a study of 334 samples [4]. Certain subsets of miRNAs are secreted from cancer cells into the extracellular space via multiple mechanisms, such as microvesicle-mediated pathways [5, 6]. In light of these biological features, extracellular miRNAs, also known as circulating miRNAs, have shown potential as diagnostic markers for a variety of tumors [7]. Indeed, a recent study by Asano et al [8] showed that the expression levels of 7 miRNAs performed well in determining whether serum samples were from sarcoma patients or healthy donors, as indicated by an area under the receiver operating characteristic (ROC) curve (AUC) value of 98%. Although these findings indicate significant progress, differences in miRNA detection methods mean that prediction models based on miRNA expression levels may lose accuracy when applied to other datasets. Gene pairs based on relative expression levels, which can eliminate batch effects, may be an alternative [9].

In this study, we constructed gene pairs according to the relative expression levels of miRNAs and identified a novel miRNA pair-based classifier for the screening of sarcoma.

Materials and methods

Data source

Data processing steps are shown in the flow chart (Fig 1). Gene Expression Omnibus (GEO) and ArrayExpress, the two largest sequencing data platforms, were searched for datasets. To be included, a dataset had to fulfill the following criteria: 1. it must include miRNA array or sequencing data of serum samples; 2. the serum must have been taken from sarcoma patients or healthy volunteers.

Screening miRNAs and constructing gene pairs

To construct gene pairs, we first selected miRNAs with sufficient expression in sarcoma. Missing expression values were filled using the K-nearest neighbors (KNN) algorithm. Then, miRNAs with an expression higher than 8 (log2 scale) in half of the samples of GSE124158 were selected. After that, samples from GSE124158, including malignant sarcoma patients and healthy subjects, were randomly allocated to a training group and a test group at a ratio of 3:1. In the training dataset, a t-test was used to test the statistical significance of each miRNA between healthy and sarcoma samples. miRNAs with a p value less than 0.05 and an effect value ranked in the top 250 were selected as candidate miRNAs. The expression levels of the candidate miRNAs then underwent pairwise comparison to generate a score for each gene pair. If the expression of the first gene (G1) of a gene pair was lower than that of the second (G2) in a given sample, the value of this gene pair in that sample was set to 1; otherwise, it was set to 0. According to these rules, we constructed the gene pair-by-sample matrix. To ensure the prediction efficiency of the model, gene pairs that had the same value (1 or 0) in most samples (>90%) of the training dataset were removed. Then, a t-test was used to test the statistical significance of each remaining gene pair between healthy and sarcoma samples in the training dataset. Gene pairs with a p value less than 0.05 and an effect value ranked in the top 80 were selected as candidate gene pairs for prediction model construction.
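The pairing and filtering rules described above can be illustrated with a minimal Python sketch. This is not the authors' code (their analysis was done in R); the function names, the miRNA labels `miR-A`, `miR-B`, `miR-C` and the expression values are hypothetical and chosen only to make the rule concrete.

```python
from itertools import combinations


def build_pair_matrix(expr, mirnas):
    """Binarize every miRNA pair in every sample.

    expr: dict mapping sample id -> dict of miRNA name -> log2 expression.
    Returns a dict mapping (g1, g2) -> list of 0/1 values, one per sample
    (samples in sorted order). A value of 1 means g1 < g2 in that sample.
    """
    samples = sorted(expr)
    matrix = {}
    for g1, g2 in combinations(mirnas, 2):
        matrix[(g1, g2)] = [1 if expr[s][g1] < expr[s][g2] else 0 for s in samples]
    return matrix


def drop_near_constant(matrix, max_fraction=0.9):
    """Remove pairs that take the same value (0 or 1) in more than max_fraction of samples."""
    kept = {}
    for pair, values in matrix.items():
        frac_ones = sum(values) / len(values)
        if max(frac_ones, 1 - frac_ones) <= max_fraction:
            kept[pair] = values
    return kept


# Hypothetical toy data: 4 samples, 3 miRNAs.
toy_expr = {
    "s1": {"miR-A": 9.1, "miR-B": 10.2, "miR-C": 8.5},
    "s2": {"miR-A": 9.8, "miR-B": 9.0, "miR-C": 8.7},
    "s3": {"miR-A": 8.9, "miR-B": 10.5, "miR-C": 8.2},
    "s4": {"miR-A": 10.4, "miR-B": 9.6, "miR-C": 8.9},
}
pair_matrix = build_pair_matrix(toy_expr, ["miR-A", "miR-B", "miR-C"])
informative_pairs = drop_near_constant(pair_matrix)
```

In this toy example only the pair (`miR-A`, `miR-B`) survives the filter, because the two pairs involving `miR-C` are 0 in every sample and therefore carry no information for classification.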

Classifier construction and validation

The random generalized linear model (RGLM) is a highly accurate and interpretable ensemble predictor that shares the advantages of a random forest (excellent predictive accuracy, feature importance measures, out-of-bag estimates of accuracy) with those of a forward selected generalized linear model (interpretability) [10]. Variables were randomly selected into 100 bags, and in each bag, variables were filtered by a correlation test and a stepwise method. A generalized linear model (GLM) was then constructed in each bag from the filtered variables. To predict the test dataset, the model uses voting to combine the predictions of the 100 independent GLMs into a final prediction. Using the screened gene pairs, RGLM was used to construct a prediction model that determined whether samples were healthy samples or sarcoma samples. Then, using the “thinRGLM” function, the gene pairs that occurred most often in the 100 GLMs were retained, and a thinned RGLM model was constructed from them. The reduction in prediction accuracy was negligible compared with the original RGLM model. Subsequently, the classifier was tested in the internal test dataset and the external dataset. Currently, no biological diagnostic indexes for soft tissue sarcomas are commercially available. To test the predictive efficiency of this model, it was compared with the original prediction model of Asano et al [8].
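The voting and thinning steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the RGLM implementation: the per-bag GLM fitting is omitted, the function names `majority_vote` and `thin_features` are hypothetical, ties are resolved to class 0 by assumption, and the example uses 3 bags instead of 100.

```python
from collections import Counter


def majority_vote(bag_predictions):
    """Combine per-bag class labels (0/1) into one label per sample.

    bag_predictions: list of lists, one inner list of 0/1 labels per bag.
    A sample is labelled 1 only if more than half of the bags vote 1
    (ties go to 0; this tie rule is an assumption of the sketch).
    """
    n_bags = len(bag_predictions)
    n_samples = len(bag_predictions[0])
    final = []
    for i in range(n_samples):
        votes_for_one = sum(bag[i] for bag in bag_predictions)
        final.append(1 if 2 * votes_for_one > n_bags else 0)
    return final


def thin_features(bag_features, min_count):
    """Keep features that were selected in at least min_count bags."""
    counts = Counter(f for feats in bag_features for f in set(feats))
    return sorted(f for f, c in counts.items() if c >= min_count)


# Hypothetical outputs of three bag-level models for three samples.
toy_bag_predictions = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
toy_final = majority_vote(toy_bag_predictions)

# Hypothetical gene pairs retained by the stepwise selection in each bag.
toy_bag_features = [["pair1", "pair2"], ["pair1", "pair3"], ["pair1", "pair2"]]
toy_thinned = thin_features(toy_bag_features, min_count=2)
```

Thinning trades a small amount of ensemble diversity for a much shorter feature list, which is what makes a final panel of only a few gene pairs practical to measure.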

miRNA network and function enrichment analysis

Using the miRNet platform [11] (https://www.mirnet.ca/), miRNA targets were predicted and a correlated network was constructed. Using the online database STRING (https://string-db.org/) [12], gene ontology (GO) function enrichment analysis was performed based on the predicted target mRNAs.

Statistical analysis

All statistical analyses were performed using R version 3.6.3 (R Foundation for Statistical Computing, http://www.R-project.org) and associated packages. The caret package (v 6.0) was used to divide the samples of GSE124158 into training and test partitions at a ratio of 3:1 according to sample type. The DMwR package (v 0.4.1) was used to fill in NA values with the values of the nearest neighbors. The t-test between groups was conducted using the genefilter package (v 1.68.0). In all statistical analyses, p<0.05 was considered statistically significant. Accuracy, specificity and sensitivity were used to evaluate the predictive performance of the model: $\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$, $\mathrm{specificity} = \frac{TN}{FP + TN}$, $\mathrm{sensitivity} = \frac{TP}{TP + FN}$ (TP: true positive; TN: true negative; FP: false positive; FN: false negative).
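As a worked example of these measures, the following Python sketch computes them from confusion-matrix counts using the standard definitions, in which sensitivity is TP/(TP+FN) and PPV is TP/(TP+FP). The counts are not reported by the authors: they are back-calculated from the external-dataset percentages in Table 3 and the split of 9 versus 26 predictions given in the Results, and they match those figures only if the 14 healthy controls are taken as the positive class. Both the counts and that reading are assumptions of this sketch.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic measures as fractions between 0 and 1."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
    }


# Assumed counts for the external test dataset (N = 35), inferred from Table 3.
external = diagnostic_metrics(tp=7, fp=2, tn=19, fn=7)
external_percent = {name: round(value * 100, 1) for name, value in external.items()}
```

With these assumed counts, `external_percent` gives 74.3, 90.5, 50.0, 77.8 and 73.1 for accuracy, specificity, sensitivity, PPV and NPV, which agrees with the external row of Table 3.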

Results

Characteristics of datasets

A total of 677 serum samples, including 402 malignant sarcoma samples and 275 healthy controls, and the associated basic clinical information were downloaded from GEO (accession number GSE124158) [8]. In addition, the external test datasets E-MTAB-3273 [13], E-MTAB-3888 and E-MTAB-5126, containing 10 synovial sarcoma serum samples, 6 liposarcoma serum samples, 5 leiomyosarcoma serum samples and 14 healthy controls, were downloaded from ArrayExpress. Detailed clinical information on GSE124158 and the merged external dataset is presented in Table 1 and S1 Table.

Table 1. Clinical information of dataset GSE124158.

Characteristics	Sarcoma	Healthy
Age (median±sd) (years)	48±22	51±12
Gender (N)
Male	244 (60.7%)	150 (54.5%)
Female	158 (39.3%)	125 (45.6%)
Stage (N)
Stage I	36 (9.0%)
Stage II	125 (31.1%)
Stage III	137 (34.1%)
Stage IV	100 (24.9%)
Unknown	4 (1.0%)

Gene screening and the construction of gene pairs

At the threshold of 8 (log2 scale), 362 miRNAs were identified as abundant miRNAs. Using the t-test, 250 miRNAs were selected for construction of the gene pairs. These miRNAs were paired to produce a total of 31,125 gene pairs. Next, gene pairs with the same value (0 or 1) in most of the samples (>90%) were removed. Then, using the t-test, a total of 80 gene pairs were selected to construct the prediction model.
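The figure of 31,125 is the number of unordered pairs that can be formed from 250 candidate miRNAs, 250 × 249 / 2. A short Python check, with variable names that are illustrative only:

```python
from itertools import combinations

n_candidates = 250

# Closed form: n choose 2.
n_pairs_formula = n_candidates * (n_candidates - 1) // 2

# Explicit enumeration of every unordered pair of candidate indices.
n_pairs_enumerated = sum(1 for _ in combinations(range(n_candidates), 2))
```

Both values equal 31,125, so each miRNA takes part in 249 pairs before any filtering is applied.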

Prediction model

In the RGLM prediction model, the number of gene pairs was reduced while preserving prediction accuracy. A total of 6 gene pairs, containing 9 miRNAs, were used in the final prediction model. The gene pairs comprised hsa-miR-378c, hsa-miR-383-3p, hsa-miR-454-5p, hsa-miR-4740-5p, hsa-miR-5007-3p, hsa-miR-380-5p, hsa-miR-499b-3p, hsa-miR-571 and hsa-miR-518a-3p, and are listed in Table 2. The RGLM model showed 100% accuracy, specificity and sensitivity in predicting whether samples were healthy samples or sarcoma samples in the internal test dataset. In the external test dataset, the gene pair-based prediction model showed an accuracy of 74.3%, with a specificity of 90.5% and a sensitivity of 50.0% (Table 3). In our prediction model, 9 samples were predicted as normal samples and 26 samples were predicted as STS samples. In contrast, all 35 samples in the external dataset were predicted to be healthy by Asano's model.

Table 2. Gene pairs for RGLM prediction model.

Gene pairs	Gene 1	Gene 2
1	hsa-miR-378c	hsa-miR-380-5p
2	hsa-miR-378c	hsa-miR-499b-3p
3	hsa-miR-383-3p	hsa-miR-571
4	hsa-miR-454-5p	hsa-miR-571
5	hsa-miR-4740-5p	hsa-miR-5007-3p
6	hsa-miR-5007-3p	hsa-miR-518a-3p

Table 3. Predictive accuracy of the gene pair classifier.

Datasets	Accuracy	Specificity	Sensitivity	PPV	NPV
Train dataset (N = 508)	99.8%	99.7%	100%	99.5%	100%
Internal test dataset (N = 169)	100%	100%	100%	100%	100%
External test dataset (N = 35)	74.3%	90.5%	50.0%	77.8%	73.1%

PPV: positive predictive value; NPV: negative predictive value.

miRNA network and function enrichment

Five of the 9 miRNAs have experimentally validated target information in the miRNet platform. Based on the miRNA-mRNA target information of miRNet, a miRNA network containing 247 nodes and 248 edges was constructed (Fig 2, S2 and S3 Tables). Four miRNAs (hsa-mir-378c, hsa-mir-454-5p, hsa-mir-571 and hsa-mir-499b-3p) were linked with their respective targets and shared common targets. Function enrichment analysis revealed that the target mRNAs mainly play a role in metabolic processes, such as protein metabolic process and cellular macromolecule metabolic process (Fig 3). Furthermore, the MAPK and FoxO signaling pathways were also significantly enriched (S4 Table).

Discussion

Traditional methods of detecting tumors are usually harmful to the human body. It is well known that miRNAs are involved in development and physiological processes, and that their dysregulation may lead to the development of several diseases [14]. Because miRNAs can reflect pathological processes, they have been considered useful biomarkers for diagnosis and pathogenesis, as well as for classification of different cancer types [15].

With the development of high-throughput sequencing technology, an increasing number of genetic testing approaches have been applied in clinical medicine to diagnose disease or assess prognosis. For instance, Li et al developed an individualized immune signature that can estimate the prognosis of patients with early-stage non-small cell lung cancer [16]. In previous studies, gene pair-based prognostic signatures have been used to estimate the prognosis of gastric cancer [17] and the relapse-free survival of colorectal cancer [18]. In our study, we aimed to investigate the diagnostic ability of gene pairs for sarcomas.

Due to technical and algorithmic differences, models based on the gene expression levels of one dataset may not accurately predict another dataset, which is especially true for datasets from different sequencing platforms [9]. Asano et al developed an index based on gene expression levels that could distinguish sarcomas from benign or healthy samples [8]. In our model, the accuracy of prediction reached 100% in the internal test dataset. In addition, we applied our model to external datasets and achieved an accuracy of 74.3% with a specificity of 90.5%. In contrast, the prediction model “Index Ⅵ”, which is based on the expression levels of 7 miRNAs, failed to classify the samples. This may be caused by large differences in gene expression signals between different chip platforms. Gene pairs can reduce this effect because they depend only on the relative order of two genes within a sample, not on their specific expression levels.
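The robustness argument can be made concrete with a small Python sketch. It assumes an idealized platform difference, namely that a second platform rescales and shifts every intensity of a sample by the same increasing transformation; real platform effects are not guaranteed to behave this way. The miRNA labels, the intensities, the transformation and the cutoff of 7.0 are all hypothetical.

```python
def pair_value(sample, g1, g2):
    """Gene-pair feature: 1 if g1 is expressed below g2 in this sample, else 0."""
    return 1 if sample[g1] < sample[g2] else 0


def above_cutoff(value, cutoff=7.0):
    """Absolute-expression rule with a fixed cutoff (hypothetical threshold)."""
    return value > cutoff


# Hypothetical measurement of one serum sample on platform A.
platform_a = {"miR-X": 6.2, "miR-Y": 7.9}

# The same sample on a hypothetical platform B with a different signal scale.
platform_b = {name: 1.8 * value + 3.0 for name, value in platform_a.items()}

pair_on_a = pair_value(platform_a, "miR-X", "miR-Y")
pair_on_b = pair_value(platform_b, "miR-X", "miR-Y")

cutoff_on_a = above_cutoff(platform_a["miR-X"])
cutoff_on_b = above_cutoff(platform_b["miR-X"])
```

The pair feature is 1 on both platforms, whereas the fixed-cutoff rule gives different answers for the same sample. Any transformation that preserves the within-sample ordering of the two miRNAs leaves the pair feature unchanged, which is the property the gene-pair approach relies on.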

As a preliminary exploration of the function of miRNAs in the prediction model, we constructed a miRNA-mRNA network using miRNA and their target information. Enrichment analysis revealed that the network may be related to metabolic processes, but further confirmation is needed.

Blood samples are easily obtained during physical examination, and further analysis of blood samples to identify tumor patients would benefit the treatment and prognosis of tumors. On this basis, we built a classifier based on gene pairs, which showed good accuracy and specificity. Due to limited data, the reliability of this classifier needs to be tested in more samples.

Conclusions

In our study, we identified a novel gene pair-based classifier that is promising for the screening of sarcomas.

Supporting information

S1 Table. Clinical characteristics of the external datasets. (DOCX)

S2 Table. Nodes information of miRNA-mRNA network. (DOCX)

S3 Table. Edges information of miRNA-mRNA network. (CSV)

S4 Table. GO and KEGG enrichment items of miRNA network. (XLSX)

S1 Codes. Codes and original data for reproducing the results in this study. (ZIP)

Data Availability

Data are publicly available from the Gene Expression Omnibus (GEO) at www.ncbi.nlm.nih.gov/geo/ (accession code GSE124158) and from the ArrayExpress repository at www.ebi.ac.uk/arrayexpress/ (accession codes E-MTAB-3273, E-MTAB-3888, E-MTAB-5126).

Funding Statement

The work was supported by grants from the National Natural Science Foundation of China (No. 81571530 and 81871245).

References

1. Tobias J, Hochhauser D. Bone and soft-tissue sarcomas. Cancer and its Management 2014. p. 446–69.
2. Casali PG, Abecassis N, Aro HT, Bauer S, Biagini R, Bielack S, et al. Soft tissue and visceral sarcomas: ESMO-EURACAN Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol. 2018;29(Supplement_4):iv268–iv9. Epub 2018/10/05. doi:10.1093/annonc/mdy321.
3. de Planell-Saguer M, Rodicio MC. Analytical aspects of microRNA in diagnostics: a review. Analytica chimica acta. 2011;699(2):134–52. Epub 2011/06/28. doi:10.1016/j.aca.2011.05.025.
4. Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, et al. MicroRNA expression profiles classify human cancers. Nature. 2005;435(7043):834–8. doi:10.1038/nature03702.
5. D'Souza-Schorey C, Clancy JW. Tumor-derived microvesicles: shedding light on novel microenvironment modulators and prospective cancer biomarkers. Genes Dev. 2012;26(12):1287–99. Epub 2012/06/21. doi:10.1101/gad.192351.112.
6. Kosaka N, Yoshioka Y, Fujita Y, Ochiya T. Versatile roles of extracellular vesicles in cancer. J Clin Invest. 2016;126(4):1163–72. Epub 2016/03/15. doi:10.1172/JCI81130.
7. Schwarzenbach H, Nishida N, Calin GA, Pantel K. Clinical relevance of circulating cell-free microRNAs in cancer. Nat Rev Clin Oncol. 2014;11(3):145–56. Epub 2014/02/05. doi:10.1038/nrclinonc.2014.5.
8. Asano N, Matsuzaki J, Ichikawa M, Kawauchi J, Takizawa S, Aoki Y, et al. A serum microRNA classifier for the diagnosis of sarcomas of various histological subtypes. Nat Commun. 2019;10:1299. doi:10.1038/s41467-019-09143-8.
9. Leek JT, Scharpf RB, Bravo HC, Simcha D, Langmead B, Johnson WE, et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nature reviews Genetics. 2010;11(10):733–9. Epub 2010/09/15. doi:10.1038/nrg2825.
10. Song L, Langfelder P, Horvath S. Random Generalized Linear Model: A Highly Accurate and Interpretable Ensemble Predictor. BMC Bioinformatics. 2013;14(1):5.
11. Fan YN, Habib M, Xia JG. Xeno-miRNet: a comprehensive database and analytics platform to explore xeno-miRNAs and their potential targets. PeerJ. 2018;6:e5650. doi:10.7717/peerj.5650.
12. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):D447–52. Epub 2014/10/30. doi:10.1093/nar/gku1003.
13. Fricke A, Ullrich PV, Heinz J, Pfeifer D, Scholber J, Herget GW, et al. Identification of a blood-borne miRNA signature of synovial sarcoma. Mol Cancer. 2015;14:151. Epub 2015/08/08. doi:10.1186/s12943-015-0424-z.
14. Hammond SM. An overview of microRNAs. Adv Drug Deliv Rev. 2015;87:3–14. Epub 2015/05/17. doi:10.1016/j.addr.2015.05.001.
15. Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, et al. MicroRNA expression profiles classify human cancers. Nature. 2005;435(7043):834–8. Epub 2005/06/10. doi:10.1038/nature03702.
16. Li B, Cui Y, Diehn M, Li R. Development and Validation of an Individualized Immune Prognostic Signature in Early-Stage Nonsquamous Non-Small Cell Lung Cancer. JAMA oncology. 2017;3(11):1529–37. Epub 2017/07/09. doi:10.1001/jamaoncol.2017.1609.
17. Peng PL, Zhou XY, Yi GD, Chen PF, Wang F, Dong WG. Identification of a novel gene pairs signature in the prognosis of gastric cancer. Cancer medicine. 2018;7(2):344–50. Epub 2017/12/29. doi:10.1002/cam4.1303.
18. P S, J W, Y T, C X, Medicine ZXJ. Gene pair based prognostic signature for colorectal colon cancer. 2018;97(42):e12788.

PLoS One. doi: 10.1371/journal.pone.0236097.r001

Decision Letter 0

David M Loeb

5 Mar 2020

PONE-D-19-34131

Development and validation of individualized miRNA signatures for sarcoma

PLOS ONE

Dear Dr. Zhu,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Apr 19 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

David M Loeb

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

3. Please include a copy of Table 1 and 2 which you refer to in your text on page 7.

4. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

Reviewer #3: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: No

Reviewer #3: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this paper by Jin et al, the authors evaluated circulating microRNA signatures from patients with soft tissue sarcomas (STS) in order to develop predictive models of diagnosis. The investigators evaluated more than 880 sarcoma samples and determined 13 gene pairs that could be used for diagnosis with an AUC of 100% in their initial set, but only 68% in the external validation set. Of note, they also determined that the miRNA gene pairs showed enrichment for functional roles in cell cycle regulation. They concluded that their serum miRNA pairs can be used for the diagnosis of STS with high degree of accuracy and sensitivity. Overall, these authors have developed a non-invasive and potentially clinically applicable prediction model using serum microRNA signatures. However, there are some concerns regarding their approaches and conclusions, which are mentioned below.

Lines 59-60: Please further explain exactly what is meant by the statement: “detecting the level of miRNAs in body fluids, the state of somatic tumor cells can be inferred..”

Lines 106-107: What does it mean for the miRNAs to be “paired in the order from front to back”?

It is still not clear how the miRNAs were paired. It would be helpful to the general audience to explain this process further.

Lines 159-160: How were the 96 gene pairs, from more than 11,0000 pairs, selected to construct the prediction model.

But then in lines 163-164 it is stated that a total of 13 gene pairs, containing 16 miRNAs were used. All of this process is very confusing and difficult to follow for the reader.

Lines 173-174: It would be informative to provide more information regarding the other models used for prediction development and the comparison with this particular model.

Finally, all of the figures are extremely difficult to review. The clarity is extremely poor.

Reviewer #2: In this manuscript, the authors describe a miRNA gene pairs-based index to differentiate patients who have bony lesions from healthy controls. The concept is interesting and such biomarker development has tremendous promise in the sarcoma field. In particular, it is intriguing that the predictive signature was consistent across all stages of disease. However, there are critical weaknesses that need to be addressed for this manuscript to be considered for publication.

Major critiques:

1. The patient cohort is comprised of tumors listed as benign, intermediate, and malignant, all of which (individually and collectively) can be distinguished from healthy controls. Benign bone lesions (40+% in the cohort), by definition, are not sarcomas, and therefore the major conclusion that this model can distinguish sarcoma is not substantiated by the data. The signature can impressively distinguish all bone lesions from healthy controls in the test/training set, and there is only a minimal ability (figure 3C) to distinguish malignant vs benign. The authors may want to contend that a benign bone lesion may undergo malignant transformation, and therefore distinguishing benign vs intermediate vs malignant is not critical in their model.

2. The title, abstract, and body of the manuscript state that the approach has been validated, yet the validation in a very small (25 patient) external dataset did not perform nearly as well as the test/training set (65%). This is not unusual performance for such assays when there are differences in sample processing, chips, etc, though it fails to achieve the bar of validation as highlighted in the title and elsewhere. Of interest, the external dataset contained patients with varied sarcomas (i.e., malignant tumors), yet the test/training sets were composed of patients who had malignant and benign lesions. Separating out benign from malignant, and possibly eliminating patients with “intermediate” tumors from your analysis may lead to a more robust predictive model.

3. The broad title focused on miRNA signatures for sarcoma does not reflect the findings in the manuscript since it’s not only sarcoma that was differentiated from healthy controls, but rather bone lesions that may or may not be sarcoma.

Minor critique:

1. In the introductory background about the promise of early detection of patients with sarcoma from healthy controls, it would be more compelling to describe practical application, such as a patient presenting with a bone lesion and this model signature can differentiate benign from malignant. Further clinical application of such an assay could be as a post-treatment screening test for patients with sarcoma to identify early recurrence of disease. However, based on the major critiques above, this may not be a potential application of the described research.

Reviewer #3: This manuscript utilizes an in silico analysis of miRNA database to develop a predictive model for the detection of sarcomas from serum samples.

Some suggestions for the authors' consideration:

1. In line 160, the authors state that 96 gene pairs were selected to construct the prediction model; however, in the prediction model section they state that a total of 13 gene pairs were used. The authors should consider explaining how they narrowed the gene pairs down from 96 to 13, and perhaps also why they felt that was necessary.

2. In line 178, the authors state that the gene pair prediction model showed an AUC of 68.4% in the outside data set. This is a large drop in the ability of the model to predict sarcoma when using an external data set. In some literature an AUC of 0.7-0.8 is considered acceptable and 0.8-0.9 is considered excellent. The authors' conclusion that their model has high accuracy and sensitivity does not seem to be consistent with its performance on the external data set. The AUC of 100 in the internal data set may simply indicate that differences in the methodologies used to measure the serum miRNA have a larger impact than the gene-pair model they developed. The authors should explain the implications of the drop in the AUC with regard to the broader applicability of the model.

3. The authors conclude that their test is very sensitive. It would be very important for the authors to state more clearly the stage of the disease when the serum samples were obtained. If the serum samples were all obtained when the patients had active detectable disease, what would be the applicability of this model? Knowing that a patient had a "sarcoma" may not necessarily be practically beneficial, as a biopsy would still be required to define histological type. In lines 239-240, the authors state that this model "is promising in the diagnosis of sarcoma". However, that information, even if the AUC was better than 68%, would not be sufficient for making treatment decisions. How well does this model work in the minimal residual disease state? Might this have broader applicability to replace serial imaging? The authors may consider narrowing their conclusions to state that the model they have generated was outstanding in their internal test data set, but had a significant reduction in the AUC when exposed to an external data set in detecting sarcomas in patients with known active disease.

4. In the Methods section, for the sake of completeness, the authors may consider stating the platform on which the statistical analysis was performed.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Jul 16;15(7):e0236097. doi: 10.1371/journal.pone.0236097.r002

Author response to Decision Letter 0

20 Apr 2020

Editor:

Answer: Thank you for your kind reminder. We have carefully modified the style of the manuscript by referring to the provided template.

Answer: Thank you for your kind reminder. We have updated the Data Availability Statement as required. All the original data used in this study were downloaded from a public database, and the code used to reproduce the results is provided as supplementary files.

In your revised cover letter, please address the following prompts:

We will update your Data Availability statement on your behalf to reflect the information you provide.

Answer: We have revised the cover letter as required. There is no ethical or legal restriction on sharing the datasets used in this study. The data sources and detailed methods used to reproduce the results of this study have been indicated in the manuscript and supplementary files.

3. Please include a copy of Table 1 and 2 which you refer to in your text on page 7.

Answer: We have added the Tables to the end of the manuscript as required.

Answer: We have added the captions of supplementary files in the manuscript.

Lines 59-60: Please further explain exactly what is meant by the statement: “detecting the level of miRNAs in body fluids, the state of somatic tumor cells can be inferred..”

Answer: We really appreciate your hard work. We have explained this further in the manuscript (lines 54-62).

Lines 106-107: What does it mean for the miRNAs to be “paired in the order from front to back”?

It is still not clear how the miRNAs were paired. It would be helpful to more of the general audience to further explain this process.

Answer: Thank you for your suggestions. We first arranged the genes in ascending order according to their names in the expression matrix, and then made non-repeating pairings. Since the order within a gene pair has no effect on the prediction model, we removed this part of the description. Instead, we provide the code for building gene pairs in the supplemental files (S1 File).
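
The pairing step described in this answer can be sketched roughly as follows. This is a minimal Python illustration and not the authors' S1 File code: the function name `build_pair_features`, the miRNA names, and the expression values are hypothetical, and encoding a pair as 1 when the first miRNA is expressed above the second is an assumption based on the manuscript's use of relative expression levels.

```python
from itertools import combinations


def build_pair_features(expression):
    """Build non-repeating miRNA pairs for one sample.

    expression: dict mapping miRNA name -> expression value (hypothetical input).
    Returns a dict mapping "A|B" -> 1 if A is expressed above B, else 0.
    """
    names = sorted(expression)  # ascending order by miRNA name
    features = {}
    # combinations() yields each unordered pair exactly once, so (A, B)
    # is produced but (B, A) is not.
    for first, second in combinations(names, 2):
        # Assumption: the pair feature is the within-sample ordering.
        features[f"{first}|{second}"] = 1 if expression[first] > expression[second] else 0
    return features


# Hypothetical expression values for a single serum sample.
sample = {"miR-b": 5.2, "miR-a": 7.1, "miR-c": 6.0}
pairs = build_pair_features(sample)
```

With n miRNAs this produces n(n-1)/2 pairs, which is why a screening step is needed before model building when n is large.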

Lines 159-160: How were the 96 gene pairs, from more than 11,0000 pairs, selected to construct the prediction model.

But then in lines 163-164 it is stated that a total of 13 gene pairs, containing 16 miRNAs were used. All of this process is very confusing and difficult to follow for the reader.

Answer: We apologize for the inconvenience caused by our unclear description. We streamlined the process and used the t test to select genes and gene pairs that differed significantly between healthy and malignant sarcoma samples (lines 93-109).
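
As a rough illustration of this kind of screening, the sketch below computes a two-sample t statistic per feature and keeps features whose absolute statistic exceeds a cutoff. Several things here are assumptions rather than details from the manuscript: the unequal-variance (Welch) form of the statistic, the cutoff of 2.0 in place of a p-value threshold, the function names, and the toy values.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's two-sample t statistic (assumed variant; needs >= 2 values per group)."""
    standard_error = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    # Note: a feature that is constant in both groups gives a zero standard error.
    return (mean(group_a) - mean(group_b)) / standard_error


def screen_features(healthy, sarcoma, threshold=2.0):
    """Keep feature names whose |t| reaches the (hypothetical) threshold.

    healthy, sarcoma: dicts mapping feature name -> list of values per sample.
    """
    return [
        name
        for name in sorted(healthy)
        if abs(welch_t(healthy[name], sarcoma[name])) >= threshold
    ]


# Toy values, not taken from the study.
healthy = {"miR-a": [1.0, 2.0, 3.0], "miR-b": [2.0, 3.0, 4.0]}
sarcoma = {"miR-a": [4.0, 5.0, 6.0], "miR-b": [2.5, 3.0, 4.5]}
selected = screen_features(healthy, sarcoma)
```

In practice a p-value with multiple-testing correction would usually replace the fixed cutoff; the cutoff is used here only to keep the sketch dependency-free.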

Lines 173-174: It would be informative to provide more information regarding the other models used for prediction development and the comparison with this particular model.

Answer: Thank you for your kind suggestions. We compared the gene pair classifier with the gene expression based LDA model of Asano (lines 187-188, 232-234).

Finally, all of the figures are extremely difficult to review. The clarity is extremely poor.

Answer: We apologize for that mistake. We have reorganized all the figures to solve this problem.

Major critiques:

Answer: We really appreciate your insightful ideas. Following your suggestion, we tried to build a model to distinguish benign from malignant tumors, but the result was poor. These results are consistent with the conclusions of Asano's paper, and the two types of tumors could not be distinguished by PCA of miRNAs. With reference to your ideas, we therefore reselected the samples: only healthy and malignant samples were retained, and the model was rebuilt. Our model showed high specificity in both the internal and external datasets and has potential for clinical screening and complementary diagnosis.

Answer: Thank you for your kind suggestions. By eliminating both intermediate and benign tumors from the dataset, we have constructed a more robust predictive model for sarcoma discrimination.

Answer: Thank you for your reasonable concern. To make the experimental design more rigorous, benign and intermediate samples were excluded in this version of the manuscript.

Minor critique:

Answer: It was a good idea, and we tried it. From the Asano dataset, we retained only the benign and malignant samples and randomly divided them into training and validation sets at a ratio of 3:1. A model built from gene pairs in the training set achieved more than 90% accuracy in the validation set. However, we could not find other sequencing data of serum miRNA from benign and malignant samples in the database, so we could not evaluate the performance of this model on external data. In the present version of the manuscript, we modified the datasets and methods for model construction. The model was able to distinguish sarcoma samples from healthy samples in external datasets with an accuracy of 74.3%.
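
A random 3:1 split of the kind mentioned in this answer might look like the following sketch. The function name `split_3_to_1`, the fixed seed, and the integer sample IDs are hypothetical, and the sketch does not stratify by class, which may differ from what the authors did.

```python
import random


def split_3_to_1(sample_ids, seed=0):
    """Randomly split sample IDs into training and validation sets at 3:1."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    cut = (len(ids) * 3) // 4  # first three quarters go to training
    return ids[:cut], ids[cut:]


# Hypothetical sample IDs 0..99.
train_ids, validation_ids = split_3_to_1(range(100))
```

Fixing the seed makes the partition repeatable, which matters when accuracy on the held-out quarter is reported.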

Reviewer #3: This manuscript utilizes an in silico analysis of miRNA database to develop a predictive model for the detection of sarcomas from serum samples.

Some suggestions for the authors' consideration:

Answer: Thank you for your reasonable concern. In the actual modeling process, we tested a series of parameters, aiming to reduce the number of genes needed as much as possible while preserving the accuracy of the model. In the current version of the manuscript, we have streamlined this process to make it easier to understand.

Answer: Thank you. To reduce this gap and improve the accuracy of the model in the external dataset, we restricted the dataset so that only healthy and malignant samples were retained. In the present version of the manuscript, the model achieved an accuracy of 74.3% in the external dataset with a specificity of 90%.

Answer: Thank you for your kind suggestions. In the present version of the manuscript, we provide the stage information in Table 1. As you can see, early-stage patients (stage Ⅰ and stage Ⅱ) account for 40% of the dataset, which indicates that our gene pair based classifier could identify sarcomas of all stages. We narrowed our conclusions as suggested (lines 1, 29, 49, 248). We would like our gene pair based classifier to be a screening method for sarcoma.

4. In the Methods section, for the sake of completeness, the authors may consider stating the platform on which the statistical analysis was performed.

Answer: Thank you for your kind suggestions. We have described the software and statistical methods used in our analysis as recommended (lines 143-152).

Attachment

Submitted filename: Response to Reviewers.docx


PLoS One. doi: 10.1371/journal.pone.0236097.r003

Decision Letter 1

David M Loeb

3 Jun 2020

PONE-D-19-34131R1

A novel serum miRNA-pair classifier for diagnosis of sarcoma

PLOS ONE

Dear Dr. Zhu,

Please address the 2 minor concerns raised by Reviewer 2.

Please submit your revised manuscript by Jul 18 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

David M Loeb

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #2: Yes

Reviewer #3: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #3: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Reviewer #2: The authors have addressed reviewer critiques. An additional limitation of the study that was not addressed in the discussion is pooling of all stages of sarcoma, as this may skew results/accuracy/specificity in that higher stage patients may have more biologically aggressive disease and circulating tumor biomarkers. Subset analyses, if numbers were high enough, would be an additional important part of future investigation and validation of the signature.

Reviewer #3: The authors have addressed all the reviewers' comments. However, they have changed some language, which may require further clarification.

1. Statistical analysis: in the original manuscript, the authors used AUC. In this version, the authors changed the analysis to accuracy. They should explain in the statistical analysis section how they define accuracy. They should also state in that section that they report sensitivity, specificity, and negative and positive predictive values.

2. Results: characteristic datasets. In the original manuscript the authors had included intermediate and low grade tumors. As per the authors, they have excluded these from this analysis. They may want to clarify in the characteristics of datasets section that they only included high-grade lesions.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #3: No


PLoS One. 2020 Jul 16;15(7):e0236097. doi: 10.1371/journal.pone.0236097.r004

Author response to Decision Letter 1

7 Jun 2020

Answer: Thank you for your kind suggestions. In this study, accuracy = (TP + TN)/(TP + TN + FP + FN), where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives. We have explained this in the statistical analysis section as suggested (lines 152-156) and expanded the results (lines 187-193).
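
The accuracy definition given in this answer, together with the sensitivity, specificity, and predictive values requested by Reviewer #3, can be computed from a confusion matrix as in the sketch below. The function name `diagnostic_metrics` and the counts in the example are hypothetical and are not taken from the study; the formulas are the standard definitions.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, tn=45, fp=5, fn=10)
```

Reporting all five values side by side shows how a classifier can combine high specificity with more modest accuracy when sensitivity is lower.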

Answer: Thank you for your kind reminder. We have clarified this in the characteristics of datasets section as suggested (line 160).

Attachment

Submitted filename: Response to Reviewers.docx


PLoS One. doi: 10.1371/journal.pone.0236097.r005

Decision Letter 2

David M Loeb

30 Jun 2020

A novel serum miRNA-pair classifier for diagnosis of sarcoma

PONE-D-19-34131R2

Dear Dr. Zhu,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

David M Loeb

Academic Editor

PLOS ONE


PLoS One. doi: 10.1371/journal.pone.0236097.r006

Acceptance letter

David M Loeb

6 Jul 2020

PONE-D-19-34131R2

A novel serum miRNA-pair classifier for diagnosis of sarcoma

Dear Dr. Zhu:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. David M Loeb

Academic Editor

PLOS ONE

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 Table. Clinical characteristics of the external datasets.

(DOCX)


S2 Table. Nodes information of miRNA-mRNA network.

(DOCX)


S3 Table. Edges information of miRNA-mRNA network.

(CSV)


S4 Table. GO and KEGG enrichment items of miRNA network.

(XLSX)


S1 Codes. Codes and original data for reproducing the results in this study.

(ZIP)



Data Availability Statement

1. Tobias J, Hochhauser D. Bone and soft-tissue sarcomas. Cancer and its Management 2014. p. 446–69.

2. Casali PG, Abecassis N, Aro HT, Bauer S, Biagini R, Bielack S, et al. Soft tissue and visceral sarcomas: ESMO-EURACAN Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol. 2018;29(Supplement_4):iv268–iv9. Epub 2018/10/05. doi: 10.1093/annonc/mdy321.

3. de Planell-Saguer M, Rodicio MC. Analytical aspects of microRNA in diagnostics: a review. Analytica chimica acta. 2011;699(2):134–52. Epub 2011/06/28. doi: 10.1016/j.aca.2011.05.025.

4. Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, et al. MicroRNA expression profiles classify human cancers. Nature. 2005;435(7043):834–8. doi: 10.1038/nature03702.

5. D'Souza-Schorey C, Clancy JW. Tumor-derived microvesicles: shedding light on novel microenvironment modulators and prospective cancer biomarkers. Genes Dev. 2012;26(12):1287–99. Epub 2012/06/21. doi: 10.1101/gad.192351.112.

6. Kosaka N, Yoshioka Y, Fujita Y, Ochiya T. Versatile roles of extracellular vesicles in cancer. J Clin Invest. 2016;126(4):1163–72. Epub 2016/03/15. doi: 10.1172/JCI81130.

7. Schwarzenbach H, Nishida N, Calin GA, Pantel K. Clinical relevance of circulating cell-free microRNAs in cancer. Nat Rev Clin Oncol. 2014;11(3):145–56. Epub 2014/02/05. doi: 10.1038/nrclinonc.2014.5.

8. Asano N, Matsuzaki J, Ichikawa M, Kawauchi J, Takizawa S, Aoki Y, et al. A serum microRNA classifier for the diagnosis of sarcomas of various histological subtypes. Nat Commun. 2019;10:1299. doi: 10.1038/s41467-019-09143-8.

9. Leek JT, Scharpf RB, Bravo HC, Simcha D, Langmead B, Johnson WE, et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nature reviews Genetics. 2010;11(10):733–9. Epub 2010/09/15. doi: 10.1038/nrg2825.

10. Song L, Langfelder P, Horvath S. Random Generalized Linear Model: A Highly Accurate and Interpretable Ensemble Predictor. BMC Bioinformatics. 2013;14(1):5.

11. Fan YN, Habib M, Xia JG. Xeno-miRNet: a comprehensive database and analytics platform to explore xeno-miRNAs and their potential targets. PeerJ. 2018;6:e5650. doi: 10.7717/peerj.5650.

12. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):D447–52. Epub 2014/10/30. doi: 10.1093/nar/gku1003.

13. Fricke A, Ullrich PV, Heinz J, Pfeifer D, Scholber J, Herget GW, et al. Identification of a blood-borne miRNA signature of synovial sarcoma. Mol Cancer. 2015;14:151. Epub 2015/08/08. doi: 10.1186/s12943-015-0424-z.

14. Hammond SM. An overview of microRNAs. Adv Drug Deliv Rev. 2015;87:3–14. Epub 2015/05/17. doi: 10.1016/j.addr.2015.05.001.

15. Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, et al. MicroRNA expression profiles classify human cancers. Nature. 2005;435(7043):834–8. Epub 2005/06/10. doi: 10.1038/nature03702.

16. Li B, Cui Y, Diehn M, Li R. Development and Validation of an Individualized Immune Prognostic Signature in Early-Stage Nonsquamous Non-Small Cell Lung Cancer. JAMA oncology. 2017;3(11):1529–37. Epub 2017/07/09. doi: 10.1001/jamaoncol.2017.1609.

17. Peng PL, Zhou XY, Yi GD, Chen PF, Wang F, Dong WG. Identification of a novel gene pairs signature in the prognosis of gastric cancer. Cancer medicine. 2018;7(2):344–50. Epub 2017/12/29. doi: 10.1002/cam4.1303.

18. P S, J W, Y T, C X, Medicine ZXJ. Gene pair based prognostic signature for colorectal colon cancer. 2018;97(42):e12788.


