PLOS ONE. 2020 Jun 4;15(6):e0233713. doi: 10.1371/journal.pone.0233713

Peripheral blood transcriptome identifies high-risk benign and malignant breast lesions

Hong Hou 1,#, Yali Lyu 2,#, Jing Jiang 3, Min Wang 2, Ruirui Zhang 2, Choong-Chin Liew 4,5,6, Binggao Wang 1,*, Changming Cheng 2,*
Editor: Sumitra Deb 7
PMCID: PMC7272048  PMID: 32497068

Abstract

Background

Peripheral blood transcriptome profiling is a potentially important tool for disease detection. We utilize this technique in a case-control study to identify candidate transcriptomic biomarkers able to differentiate women with breast lesions from normal controls.

Methods

Whole blood samples were collected from 50 women with high-risk breast lesions, 57 with breast cancers and 44 controls (151 samples). Blood gene expression profiling was carried out using microarray hybridization. We identified blood gene expression signatures using AdaBoost, and constructed a predictive model differentiating breast lesions from controls. Model performance was then characterized by AUC, sensitivity, specificity and accuracy. Biomarker biological processes and functions were analyzed for clues to the pathogenesis of breast lesions.

Results

Ten gene biomarkers were identified (YWHAQ, BCLAF1, WSB1, PBX2, DDIT4, LUC7L3, FKBP1A, APP, HERC2P2, FAM126B). A ten-gene panel predictive model showed discriminatory power in the test set (sensitivity: 100%, specificity: 84.2%, accuracy: 93.5%, AUC: 0.99). These biomarkers were involved in apoptosis, TGF-beta signaling, adaptive immune system regulation, gene transcription and post-transcriptional protein modification.

Conclusion

A promising method for the detection of breast lesions is reported. This study also sheds light on breast cancer/immune system interactions, providing clues to new targets for breast cancer immune therapy.

Introduction

Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer death in women worldwide [1]. In recent years, the incidence of breast cancer in China has been increasing, and may eventually surpass incidence rates in developed countries [2]. According to the latest GLOBOCAN 2018 report, the age-standardized incidence of breast cancer per 100,000 population in China was 36.1, which is less than half that of the United States (84.9) and the United Kingdom (93.6), although the age-standardized mortality rates per 100,000 population do not differ appreciably between China at 8.8, America at 12.7 and the United Kingdom at 14.4 [3]. The relatively high death rate for breast cancer in China is mainly due to the rapid rise in the incidence of disease, whereas incidence is stable or decreasing in Western countries [4]. The annual percentage increase in breast cancer incidence from 1999 to 2008 is over 2% in urban China and is as high as 5.5% to 6.0% in rural China [5]. It has been predicted that the number of breast cancer patients in China in 2021 will approach 2.5 million in women aged 55–69 years [6]. In addition, a large proportion of breast cancer in China occurs in younger patients who are diagnosed at age less than 50 years, whereas the peak age of breast cancer onset has been approximately 70 years in America [7].

Breast cancer is regarded as potentially curable if diagnosed and managed at an early stage. Women diagnosed with early stage breast cancer (Stage I or II) have a better prognosis (5-year survival rate, 85–98%) than do those diagnosed with advanced breast cancer (5-year survival rate for Stage III or IV, 30–70%) [8]. In addition, according to the Breast Imaging Reporting and Data System (BI-RADS), breast lesions mammographically classified in Group 2 as definitely benign require no more treatment than do those identified during routine mammography screening. Lesions mammographically or ultrasonographically classified into Group 3 or higher, however, are recommended for shorter follow-up intervals or biopsy in view of their unclear potential for malignancy.

Breast lesions at an early stage are usually asymptomatic and undetectable by self-examination, resulting in delayed treatment. Currently, early detection of breast lesions is mainly dependent on mammography or ultrasound [9]. However, the size, nodularity, and sensitivity of the breasts during lactation make imaging examination a challenge during this period [10]. Though mammography screening is helpful in reducing mortality from breast cancer [11], this method of detection is often ineffective, especially when the tumor is small. Furthermore, the false-positive and false-negative rates of mammography are relatively high for women with dense breast tissue, such as pre-menopausal women or those receiving menopausal hormone therapy [12]. Compared with mammography, ultrasound has advantages for women with dense breast tissue, but due to the poor resolution of this method in soft tissue, ultrasound is more suitable as a supplemental rather than a stand-alone screening method [13]. Thus, novel, minimally invasive biomarkers have been sought to improve the early detection of breast lesions.

Blood is a “fluid connective tissue” [14], and blood cells continuously interact with tissue cells throughout the entire body. Therefore blood cells can act as “sentinels” that indicate health or the presence of disease [15]. Peripheral blood is frequently used in clinical research because it is easy to access and potentially carries information about disease status and physiological responses. We have previously reported [16] that peripheral blood transcriptome profiling has been applied in the screening and early detection of various non‑hematologic disorders, including cancer [17–21].

In the present study, we compare the blood gene expression profiles in women with breast lesions and control women with no breast disease in order potentially to develop a non-invasive test for early stage breast cancer and breast lesions. The transcriptomic biomarkers of breast lesions were identified and the roles of these genes in biological processes and functions were analyzed for clues to the pathogenesis of breast lesions.

Materials and methods

This study was approved by the Ethics Committee of the Qingdao Central (Tumor) Hospital (IRB no. KY-P201803601) on January 30th 2019. Participants were recruited to this study from January 31st 2019 to June 30th 2019. Sample acquisition was conducted between January 31st 2019 and June 30th 2019 at the Qingdao Central (Tumor) Hospital. 151 participants were enrolled, including 44 healthy controls and 107 patients with breast lesions (50 high risk lesions and 57 breast cancer). Written informed consent was obtained from all study participants and approved by the Ethics Committee of Qingdao Central (Tumor) Hospital. All authors in this manuscript had access to individual participants’ information and medical records, and data was scrubbed after information collection.

A total of 107 blood samples from patients with breast lesions was obtained. The study population comprised 107 female adult patients (age range, 23–78 years; mean age: 50.6 ± 11.2 years), including 50 women with high-risk breast lesions and 57 breast cancer patients. All patients were recruited before they had undergone any form of treatment, including endocrinotherapy, radio/chemo-therapy, targeted therapy or surgery. The breast lesion cohorts were categorized according to pathological examination. All patients underwent mammography or ultrasound, and the results were analyzed and categorized according to the Breast Imaging Reporting and Data System (BI-RADS) Grades [22]. In cases where the grades of mammography and ultrasound were inconsistent, the higher grade was adopted. High-risk lesions were defined as BI-RADS Grades 3 to 5 with no evidence of cancer at biopsy.

Blood collection, RNA isolation and RNA quality control

Blood samples (2.5 ml) were drawn using PaxGene Blood RNA tubes (PreAnalytix GmbH, Hombrechtikon, Switzerland) and total RNA was then isolated as described in a previous publication [11]. The integrity of the purified RNA was assessed using 2100 Bioanalyzer RNA 6000 Nano Chips (Agilent Technologies, Inc., Santa Clara, CA, USA) and the quantity of RNA was assessed using a NanoDrop 1000 UV-Vis spectrophotometer (Thermo Fisher Scientific, Inc., Waltham, MA, USA). All RNA samples were assessed against the criteria of RNA integrity number ≥7.0 and 28S:18S rRNA ratio ≥1.0.

Microarray hybridization and microarray data analysis

The gene expression profiles of all 151 samples, including 44 normal controls, 50 high-risk breast lesions and 57 breast cancer, were characterized by microarray hybridization as per the manufacturer’s protocol (Gene Profiling Array cGMP U133 P2 [Affymetrix; Thermo Fisher Scientific, Inc.]). Blood total RNA (200 ng for each sample) was labeled and hybridized onto Affymetrix microarray according to the manufacturer’s protocol. Gene expression profiles were accessed using Affymetrix Expression Console software (version 1.4.1; Affymetrix; Thermo Fisher Scientific, Inc.). The raw gene expression data were normalized using the MAS5 method to make it possible to compare the profiling variations among microarrays.

The data mining method utilized for this study mostly follows the strategy described in our previous report [23]. In brief, to identify gene biomarkers for distinguishing breast lesions (high-risk benign and cancer) from normal controls, the probe sets of interest were selected from the 54,675 probe sets on the Affymetrix Gene Profiling cGMP U133 P2 microarray by filtering according to the following series of criteria: the probe sets could be detected reliably (“present” call) in all the samples; the probe sets were present within the MAQC list as reported by the MAQC Consortium; and the stably expressed probe sets, also deemed internal reference genes, were removed. The microarray intensity data were log-transformed to better satisfy Gaussian distribution requirements. All sample data were randomly divided into a training set and a test set in a proportion of 7:3.
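The filtering, transformation and splitting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the probe identifiers, the use of base-2 logarithms and the random seed are all hypothetical, and detection calls are assumed to be coded as "P" (present), "M" (marginal) or "A" (absent).

```python
import math
import random


def filter_probe_sets(calls, maqc_list, reference_probes):
    """Keep probe sets that are 'present' in every sample, appear in the
    MAQC list, and are not stably expressed reference probes."""
    kept = []
    for probe, sample_calls in calls.items():
        present_everywhere = all(call == "P" for call in sample_calls)
        if present_everywhere and probe in maqc_list and probe not in reference_probes:
            kept.append(probe)
    return sorted(kept)


def log2_transform(intensities):
    """Log-transform positive intensities (base 2 is an assumption)."""
    return [math.log2(value) for value in intensities]


def split_train_test(sample_ids, train_fraction=0.7, seed=0):
    """Randomly split samples into training and test sets (7:3 by default)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


if __name__ == "__main__":
    # Hypothetical detection calls for four probe sets across three samples.
    calls = {
        "probe_a": ["P", "P", "P"],
        "probe_b": ["P", "A", "P"],
        "probe_c": ["P", "P", "P"],
        "probe_d": ["P", "P", "P"],
    }
    print(filter_probe_sets(calls, {"probe_a", "probe_b", "probe_c"}, {"probe_c"}))
    train, test = split_train_test(range(151))
    print(len(train), len(test))
```

With 151 samples and a 0.7 training fraction, truncating to a whole number gives 105 training and 46 test samples, which matches the set sizes reported in the Results.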

To accelerate the screening of breast lesion-specific gene expression signatures, an ensemble learning strategy called AdaBoost was executed. Instead of making restrictive assumptions regarding the training set as in traditional data mining methods, this boosting method first creates a set of weak classifiers by assigning them appropriate extra weights and then combines these weak classifiers into a strong classifier. AdaBoost has important and significant advantages in both accuracy and training time as compared with other data mining methods [24]. The transcriptomic features of the breast lesions were identified and used to construct the predictive model by AdaBoost. To classify the breast lesion group and the normal control group, the area under the receiver operating characteristic curve (AUC), sensitivity, specificity and accuracy were estimated in both the training and the test groups.
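To make the boosting idea concrete, the sketch below implements a generic discrete AdaBoost with single-feature threshold rules (decision stumps) as weak classifiers. It is an illustrative assumption, not the authors' implementation: the choice of stumps, the number of rounds and all function names are hypothetical. Labels are coded +1 (breast lesion) and -1 (control).

```python
import math


def stump_predict(row, feature, threshold, sign):
    """Weak classifier: vote `sign` if the feature exceeds the threshold."""
    return sign if row[feature] > threshold else -sign


def train_stump(X, y, weights):
    """Return (error, feature, threshold, sign) with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        values = sorted(set(row[feature] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
        for threshold in thresholds:
            for sign in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, sign) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, sign)
    return best


def adaboost_fit(X, y, n_rounds=10):
    """Fit a weighted ensemble of stumps; returns a list of (alpha, stump)."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        error, feature, threshold, sign = train_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, feature, threshold, sign))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, sign))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def adaboost_score(model, row):
    """Weighted vote of all weak classifiers (the decision score)."""
    return sum(alpha * stump_predict(row, f, t, s) for alpha, f, t, s in model)


def adaboost_predict(model, row):
    return 1 if adaboost_score(model, row) > 0 else -1
```

The decision score returned by `adaboost_score` is the kind of continuous quantity from which an ROC curve can be drawn, while `adaboost_predict` thresholds it at zero to give the class call.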

Bioinformatics analysis

The GO and KEGG annotations of the selected transcriptomic genes were queried from the COXPRESdb v7 database [25]. The protein-protein interactions between each transcriptomic feature and its first neighbouring protein counterpart with number less than 20 were downloaded from the STRING database with a total confidence greater than or equal to 0.7. Gene-annotation enrichment analysis using the clusterProfiler R package was performed on signature genes and their correlative proteins. Gene Ontology (GO) terms were identified with a strict cutoff of adjusted p < 0.05 corrected with the Benjamini–Hochberg (BH) method and a false discovery rate (FDR) of less than 0.05. Reactome pathways were also identified, with a strict cutoff of p < 0.05 corrected with the BH method and a false discovery rate (FDR) of less than 0.05. The protein-protein interaction network and gene network with the final biomarkers were constructed with Cytoscape software.
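For readers unfamiliar with the Benjamini–Hochberg correction applied above, the following sketch shows how adjusted p-values are obtained from raw p-values. It is a generic illustration of the standard procedure, not code from the study; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone: each is min(p * m / rank, next larger adjusted).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.5]  # hypothetical raw p-values
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw p = {p:.3f}  adjusted p = {q:.4f}  significant = {q < 0.05}")
```

A term is then retained when its adjusted p-value falls below the 0.05 cutoff.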

Results

For this study a total of 151 blood samples was collected, including 44 controls and 107 breast lesions (50 high-risk breast lesions and 57 breast cancer lesions). Patients with breast cancer were older than the controls and older than those with high-risk lesions. Most subjects in the control group were aged less than 60 years, whereas about half (49/107) of the patients in the breast lesion cohort were older than age 60 (Table 1). The BI-RADS Grades of patients in the breast lesion group are also summarized: for high-risk lesions, the number of lesions Grade 3 and 4 was similar; for breast cancer lesions, most of the patients were Grade 5 (Table 1).

Table 1. The basic characteristics of normal controls and breast lesions.

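| Characteristic | Normal controls | High risk lesions | Breast cancer |
|---|---|---|---|
| Age (years) | | | |
| Min | 26 | 23 | 33 |
| Max | 6 | 68 | 78 |
| Mean | 42.6±11.6 | 44.9±10.0 | 55.6±9.6 |
| Age groups (years) | | | |
| 21–30 | 8 | 5 | 0 |
| 31–40 | 12 | 10 | 4 |
| 41–50 | 12 | 22 | 14 |
| 51–60 | 11 | 10 | 19 |
| 61–70 | 0 | 3 | 17 |
| 71–80 | 1 | 0 | 3 |
| Total | 44 | 50 | 57 |
| BI-RADS Grades | | | |
| 3 | | 23 | 0 |
| 4 | | 27 | 10 |
| 5 | | 0 | 42 |
| 6 | | 0 | 5 |
| Total | | 50 | 57 |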

The histopathology of the breast lesions is shown in Table 2. In the category of high-risk lesions, the two main types were hyperplasia-related disease and fibroadenoma. In the category of breast cancer, invasive breast cancer accounted for about 81% (46/57) of all histological types. Most of the breast cancer samples with a known histological grade were Grade II (26/40); the grade was unknown for 17 samples.

Table 2. The histopathological types of breast lesions.

| Diagnosis | Subtype / Histological grade | Number of samples |
|---|---|---|
| High risk lesion (50) | Hyperplasia | 21 |
| | Fibroadenoma | 17 |
| | Papilloma | 6 |
| | Phyllode tumor | 3 |
| | Adenolipoma | 1 |
| | Mammary duct ectasia | 1 |
| | Lobular atrophy | 1 |
| Invasive breast cancer (46) | Histological grade I | 2 |
| | Histological grade II | 18 |
| | Histological grade III | 9 |
| | Histological grade unknown | 17 |
| Ductal carcinoma in situ (3) | Histological grade I | 0 |
| | Histological grade II | 2 |
| | Histological grade III | 1 |
| Papillary breast cancer (2) | Histological grade I | 0 |
| | Histological grade II | 2 |
| | Histological grade III | 0 |
| Invasive lobular carcinoma (2) | Histological grade I | 0 |
| | Histological grade II | 2 |
| | Histological grade III | 0 |
| Squamous cell carcinoma (2) | Histological grade I | 0 |
| | Histological grade II | 1 |
| | Histological grade III | 1 |
| Tubular carcinoma (1) | Histological grade I | 1 |
| | Histological grade II | 0 |
| | Histological grade III | 0 |
| Mucinous carcinoma of breast (1) | Histological grade I | 0 |
| | Histological grade II | 1 |
| | Histological grade III | 0 |
| Total samples | | 107 |

Transcriptome profiling of peripheral blood samples from normal controls and breast lesions

Transcriptome profiles of peripheral blood samples taken from women in the two cohorts (44 normal controls, 107 breast lesions) were generated using the Affymetrix GeneChip U133Plus2.0. The profiles were then analyzed by comparing breast lesion and normal control samples. A final set of ten transcriptomic gene biomarkers was identified (YWHAQ, BCLAF1, WSB1, PBX2, DDIT4, LUC7L3, FKBP1A, APP, HERC2P2, FAM126B) that was able to distinguish blood samples from patients with breast lesions from normal control samples. The corresponding gene symbols and fold changes of the final ten probe sets are listed in Table 3.

Table 3. Candidate biomarkers for distinguishing breast lesions from controls.

| Probe set ID | Gene Symbol | Gene Title | Fold Change | Regulation |
|---|---|---|---|---|
| 202887_s_at | DDIT4 | DNA damage inducible transcript 4 | 2.0014469 | up |
| 214953_s_at | APP | amyloid beta (A4) precursor protein | 1.97339 | up |
| 214119_s_at | FKBP1A | FK506 binding protein 1A | 1.8358978 | up |
| 202876_s_at | PBX2 | pre-B-cell leukemia homeobox 2 | 1.7287292 | up |
| 200693_at | YWHAQ | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, theta | 1.0801506 | up |
| 217317_s_at | HERC2P2 | hect domain and RLD 2 pseudogene 2 | -1.2864129 | down |
| 208835_s_at | LUC7L3 | LUC7-like 3 pre-mRNA splicing factor | -1.3374902 | down |
| 201296_s_at | WSB1 | WD repeat and SOCS box containing 1 | -1.350631 | down |
| 201101_s_at | BCLAF1 | BCL2-associated transcription factor 1 | -1.4449192 | down |
| 1554178_a_at | FAM126B | family with sequence similarity 126, member B | -1.4458523 | down |
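Table 3 reports fold changes with a sign, where negative values denote downregulation. The text does not state the convention used; the sketch below assumes the common one in which a negative value -x means the lesion/control expression ratio is 1/x, and converts each value to a log2 ratio so that up- and downregulation sit on a symmetric scale. The function name is hypothetical.

```python
import math


def signed_fold_change_to_log2(fold_change):
    """Convert a signed fold change to a log2 expression ratio.

    Assumed convention: +x means ratio x (up); -x means ratio 1/x (down).
    """
    if fold_change == 0:
        raise ValueError("a signed fold change cannot be zero")
    magnitude = math.log2(abs(fold_change))
    return magnitude if fold_change > 0 else -magnitude


if __name__ == "__main__":
    # Two values taken from Table 3.
    for gene, fold_change in [("DDIT4", 2.0014469), ("FAM126B", -1.4458523)]:
        print(gene, round(signed_fold_change_to_log2(fold_change), 3))
```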

Model selection and performance evaluation

Based on the ten candidate biomarkers we identified, a predictive model was constructed for discriminating breast lesions from normal controls using AdaBoost.

Fig 1 demonstrates using hierarchical cluster diagrams the performance of each single gene and the ten-gene panel for distinguishing breast lesions from controls for the entire 151 samples. The ten-gene panel exhibited a better performance than any of the single genes alone in clustering breast lesion samples from normal control samples.

Fig 1. Heat map of gene expression and hierarchical cluster diagram showing 10 single candidate genes (A) and 10-gene combination (B) for clustering the 151 samples including 107 breast lesions and 44 normal controls. Dendrogram generated using “Heatmap” function in R with default settings.

To construct the predictive model, we divided the total data into a training set and a test set in proportions of 7:3. The predictive model was built on the training set, which contained a total of 105 samples (80 breast lesions and 25 normal controls). The performance of the predictive model was then evaluated on the completely independent samples in the test set, which contained a total of 46 samples (27 breast lesions and 19 normal controls). The performances of the training set and the test set are shown in Table 4. Both the training set and the test set performed well: the test set sensitivity was 100%, and specificity and accuracy were 84.2% and 93.5%, respectively. Three of the 19 normal control samples in the test set were predicted as positive results; the reason for these false-positive results requires further study in a larger cohort. The ten-gene biomarker panel also exhibited a higher ROC AUC as compared with any single biomarker, in both the training set and the test set, as shown in Fig 2. As shown in Fig 3, the box-whisker plot illustrates the well-separated distribution of prediction scores of breast lesions and normal controls, based on the 10-gene panel and AdaBoost algorithm.
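The ROC AUC reported here can be read as the probability that a randomly chosen breast lesion sample receives a higher prediction score than a randomly chosen control. The sketch below computes the AUC directly from that definition; it is a generic illustration, and the function name and example scores are hypothetical rather than taken from the study data.

```python
def roc_auc(lesion_scores, control_scores):
    """AUC as the fraction of (lesion, control) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    correct = 0.0
    for lesion in lesion_scores:
        for control in control_scores:
            if lesion > control:
                correct += 1.0
            elif lesion == control:
                correct += 0.5
    return correct / (len(lesion_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical prediction scores.
    print(roc_auc([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))  # perfectly separated
    print(roc_auc([0.9, 0.4], [0.5, 0.1]))            # one misordered pair
```

An AUC of 1 therefore corresponds to complete separation of the two score distributions, and values slightly below 1 indicate that a small number of lesion/control pairs are ranked in the wrong order.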

Table 4. Model construction and performance evaluation.

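| | Training set: breast lesions | Training set: normal control | Test set: breast lesions | Test set: normal control |
|---|---|---|---|---|
| Positive | 80 | 0 | 27 | 3 |
| Negative | 0 | 25 | 0 | 16 |
| Total | 80 | 25 | 27 | 19 |

| | Training set | Test set |
|---|---|---|
| Sensitivity | 100% | 100% |
| Specificity | 100% | 84.2% |
| Accuracy | 100% | 93.5% |
| ROC AUC | 1 | 0.99 |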
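The sensitivity, specificity and accuracy in Table 4 follow directly from the counts of positive and negative predictions. As a worked check, the sketch below recomputes them from the table's counts; the function name is hypothetical.

```python
def classification_metrics(true_pos, false_neg, false_pos, true_neg):
    """Return sensitivity, specificity and accuracy as percentages."""
    total = true_pos + false_neg + false_pos + true_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    accuracy = (true_pos + true_neg) / total
    return tuple(round(100 * value, 1) for value in (sensitivity, specificity, accuracy))


if __name__ == "__main__":
    # Test set: 27 lesions all predicted positive; 19 controls, 3 predicted positive.
    print(classification_metrics(true_pos=27, false_neg=0, false_pos=3, true_neg=16))
    # Training set: 80 lesions and 25 controls, all classified correctly.
    print(classification_metrics(true_pos=80, false_neg=0, false_pos=0, true_neg=25))
```

For the test set this gives 27/27 = 100% sensitivity, 16/19 = 84.2% specificity and 43/46 = 93.5% accuracy, in agreement with the table.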

Fig 2. ROC curve analysis for comparison of breast lesions versus normal controls.

Fig 3. Box-whisker plot to display the decision scores in breast lesions and normal controls in the training set and test set.

Red, breast lesions; green, normal controls.

Protein networks and functional enrichment analysis

The proteins interacting with the ten candidate biomarkers used for the model construction were downloaded from the STRING database, and a total of 147 proteins were identified with a confidence greater than or equal to 0.7. The detailed interaction of these proteins is shown in Fig 4. Functional enrichment analysis was conducted and pathways were identified with a strict cutoff of adjusted p<0.05, corrected with the Benjamini–Hochberg (BH) method. Our analysis identified 53 pathways involving these ten transcriptomic gene biomarkers, and we chose for further analysis the top 16 pathways, ranked by the lowest adjusted p-values. As indicated in Fig 5A, these pathways were mainly involved in apoptosis, TGF-beta signaling, adaptive immune system regulation, gene transcription and post-transcriptional protein modification. The relationship of the transcriptomic gene biomarkers identified and the pathways involved are indicated in Fig 5B.

Fig 4. Interaction map of 10 transcriptomic gene biomarkers (red circles) and their interacting proteins (blue circles), using the edge weight cutoff 0.7 (total confidence greater than or equal to 0.7).

Fig 5. Functional categorization of transcriptomic gene biomarker-related genes.

A: The top 16 pathways containing the 10 transcriptomic gene biomarkers. B: The relationship of the engaged transcriptomic genes and the pathways involved.

Discussion

In this study we report a method for differentiating breast lesions—including high-risk benign breast lesions and malignant breast lesions—from normal controls using blood transcriptomic gene expression analysis. We collected blood samples from healthy control women with no breast disease and from breast lesion patients, and focused on identifying blood transcriptomic features that can distinguish the two groups. We identified ten genes that can detect breast lesions with an accuracy higher than 90%. These preliminary results are encouraging, but further research is needed for validation.

As breast cancer is the leading cause of cancer death in women, early detection has played a critical role in the management of this disease, especially for the many women whose breast cancer has no symptoms [26]. High-risk breast lesions represent a group of clinically, morphologically, and biologically heterogeneous lesions that carry an increased risk of breast cancer, albeit to various degrees [27]. The threat of high-risk though benign breast lesions should not be underestimated. High-risk breast lesions convey a high relative risk for a later breast cancer with a cumulative incidence of 29% within 25 years [28–30]. Since high risk lesions are frequently also asymptomatic, we should explore new strategies for the detection of all breast lesions, including both breast cancer and high risk lesions not yet malignant.

In current clinical practice the most common tool used for the early detection of breast lesions is mammographic screening with complementary ultrasound. Definitive diagnosis requires biopsy. Since mammography carries high false positive rates and biopsy is traumatically invasive, the development of a novel, sensitive, non-invasive approach for early detection of breast lesions is essential to complement existing methods of detection.

To develop such an approach, we have utilized methods for cancer detection described in our blood transcriptome study and our previous reports [17,31,32], and identified a ten-gene panel (Table 3) from peripheral blood gene expression profiles. The predictive model we developed based on the ten-gene panel performed well both in the training set and test set (Figs 1 and 2). In the independent test set, the ten-gene panel differentiated breast lesions from normal controls with sensitivity of 100%, specificity of 84.2% and accuracy of 93.5% (Table 4). We are planning to follow these patients over the next few years to confirm whether those 3 false positive samples are true negative samples. Since it is essential to predict breast lesions at early stages for prevention and optimal treatment, we are interested to know whether the biomarkers identified in the present retrospective study are effective in predicting high-risk lesions or breast cancer. We also expect to further evaluate the blood based biomarkers in a future prospective study.

Among the ten candidate biomarkers we identified (YWHAQ, BCLAF1, WSB1, PBX2, DDIT4, LUC7L3, FKBP1A, APP, HERC2P2, FAM126B), five genes (DDIT4, APP, FKBP1A, PBX2, YWHAQ) were upregulated in breast lesion patients as compared with normal controls, and the other five genes were downregulated (FAM126B, BCLAF1, WSB1, LUC7L3, HERC2P2). There were a total of 147 proteins interacting with the ten transcriptomic genes (Fig 4), and functional enrichment analysis of these proteins showed they were mainly associated with apoptosis, TGF-beta signaling, adaptive immune system regulation, gene transcription and post-transcriptional protein modification (Fig 5). The gene involved in apoptosis was YWHAQ and the gene involved in TGF-beta signaling was FKBP1A. YWHAQ also took part in the process of gene transcription, together with DDIT4. In adaptive immune system regulation, FKBP1A participates in the calcineurin activation of NFAT, and WSB1 plays a role in antigen processing involving ubiquitination and proteasome degradation. WSB1 is also involved in the post-transcriptional protein modification process, neddylation.

The most over-expressed biomarker in the breast lesion group was DDIT4 (for DNA-damage-inducible transcript 4), also known as REDD1 or RTP801. The major function of the protein encoded by DDIT4 is to inhibit mTORC1, which is induced by various stress stimuli in the hypoxia inducible factor (HIF) family [33,34]. Pinto et al reported that high levels of DDIT4 were significantly associated with a worse prognosis (recurrence-free survival, time to progression and overall survival) in several cancer types, including breast cancer [35]. Their previous work indicated that high DDIT4 expression was also an independent factor for a shorter disease-free survival in chemotherapy-resistant triple negative breast tumors [36]. In another report, the dysregulation of basal DDIT4 gene expression in several cancer types (e.g. lung, breast, prostate) can be altered by promyelocytic leukemia (PML) and lead to mTOR activation and cancer progression [37]. DDIT4 also acts as a pro-death transcript in the calcitriol-induced endoplasmic reticulum stress-like response in breast cancer [38]. Consistent with these reports, in our study DDIT4 was also upregulated in breast lesions; it might therefore serve as a novel prognostic biomarker and is a potential candidate for the development of targeted therapy for breast cancer.

Another upregulated gene, YWHAQ, encodes a member of the 14-3-3 proteins, which belong to a group of highly conserved proteins that are essential components of key signaling pathways involved in apoptosis and cell proliferation. These proteins interact with proteins such as Raf, BAD, protein kinase C (PKC), and phosphatidylinositol 3-kinase [39]. The product of YWHAQ (14-3-3θ) regulates TP53 through protein-protein interactions and post-translational modifications [40], and germline variation in the TP53 network genes PRKAG2, PPP2R2B, CCNG1, PIAS1 and YWHAQ might affect prognosis and treatment outcome in breast cancer patients [41]. TP53 is closely associated with breast cancer; women who have germline TP53 mutations have a very high risk of breast cancer of up to 85% by age 60 [42]. Combining these reports with our results suggests the TP53 network gene YWHAQ may act as a predictor and new therapy target for breast cancer.

In the present study, FKBP1A participated in both the TGF-beta signaling and calcineurin activation of NFAT. FKBP1A, also named FKBP12, is a member of the FK-506-binding protein (FKBP) family, and its expression in cells is ubiquitous [43, 44]. FKBP1A mediates the immunosuppressive and antitumor effects of rapamycin [45], widely used in the treatment of breast cancer [46, 47]. One study on Eph receptors and invasive breast carcinoma suggested that the level of FKBP1A was significantly affected by EphB6, which was a target mRNA of miR-100, the changes in miRNAs and the target mRNA may have a role in PI3K/Akt/mTOR pathways [48]. FKBP1A has also been shown to inhibit TGF-beta type 1 receptor [49] and it was found overexpressed in childhood astrocytomas, which presented as the EGFR/FKBP12/HIF-2alpha pathway [50]. While an aberration of TGF-beta type 1 receptor is associated with a significantly increased risk of breast cancer [51], FKBP1A may also be associated with an elevated risk of breast cancer, as our study indicated.

Among the downregulated genes, WSB1 is associated with antigen processing (specifically, ubiquitination and proteasome degradation) and with the post-transcriptional protein modification process, neddylation. WSB-1 (WD-40 repeat-containing SOCS Box protein) is the substrate recognition element of an Elongin Cullin SOCS (ECS box) E3 ubiquitin ligase complex [52] and it was identified as a transcriptional target of HIF [53]. In the only study on the role of WSB1 in breast cancer, Poujade et al found that WSB-1 plays an important role in breast cancer metastasis. By knocking down the WSB-1 gene in breast cancer cell lines, these investigators found that the downregulation of WSB-1 gene expression levels could significantly decrease the metastatic potential of breast cancer [54].

Our results were inconsistent with the above report, however, since WSB1 was decreased in our breast lesion group. The role of WSB1 in other types of cancer is also controversial; this gene was involved in pancreatic cancer progression [55] and metastatic potential of osteosarcoma [53], but its high expression was associated with good prognosis and favorable outcome of neuroblastoma [56]. So the definite function of WSB1 in breast cancer remains unclear.

Mutations in genes related to carcinogenesis, such as p53 and BRCA1/BRCA2, have been widely observed in breast tumor cells; however, these genes did not show significant expression variation between the breast lesion and healthy control groups in peripheral blood in our study. There are several possible reasons for this. Although tumor cells could be released into a patient’s peripheral blood, the proportion of such cells as compared with white blood cells would be very low, even for patients with advanced disease. White blood cells predominate in the cell spectrum of peripheral blood, and therefore blood gene expression signatures would largely reflect these abundant white blood cells rather than the rare circulating tumor cells. In addition, as white blood cells and tumor cells play different biological roles in the process of carcinogenesis, their gene expression profiles also differ. Gene expression variations in white blood cells, for example, more likely reflect interactions between the immune system and the tumor rather than intrinsic changes within the tumor cells themselves. These differences might be an important reason why the driver genes that have been observed in tumor cells did not show abnormal signals in the gene expression profile of peripheral blood in this study. Further study is required in order to identify the signaling pathways of blood cells and their interaction with cancer cells to better understand the roles of blood cells in carcinogenesis.

Our study has several limitations. First, the sample size was relatively small and different genes or more genes that have better discriminatory power may be validated among a larger independent cohort of patients. For example, our samples show some age variation among the healthy controls, the women with high risk lesions and the breast cancer patients. Age has been regarded as an important risk factor for cancer, as the incidence of most cancers increases with age. In this study, which is restricted by a limited sample size, we tried to optimize the algorithm to eliminate the interference of age factors as much as possible. However, it is hard to confirm that the biomarkers derived are completely unrelated to age.

We intend to confirm the effectiveness of our data mining method in further studies, using a larger sample size with age-matched patients. Second, the nature of the mechanisms driving the different transcriptomic biomarkers in peripheral blood is not yet clear, and the function of some biomarkers requires further study. We are currently exploring the expression differences of the ten candidate biomarkers between high-risk breast lesions and breast cancer, which study may be helpful for the differential diagnosis of high risk lesions and breast cancer.

Finally, RNA sequencing (RNAseq) has been proven an efficient tool for transcriptome analysis, especially for exploring expression signatures of unknown transcript fragments and revealing the underlying signaling pathways. An interesting subject for future study would be to compare the variations in gene expression signatures between RNAseq and the microarray method.

Using peripheral blood gene expression profiles we identified ten transcriptomic biomarkers that could distinguish women with high-risk breast lesions and breast cancer from normal controls. Our model, based on the ten transcriptomic biomarkers identified, has shown good discriminatory power between breast lesion and control subjects. Our functional enrichment analysis suggested that our candidate biomarkers were mainly involved in apoptosis, TGF-beta signaling, adaptive immune system regulation, gene transcription and post-transcriptional protein modification. This study has therefore established a promising methodology for the non-invasive detection of breast lesions, and we have also shed light on the pathogenic mechanisms of breast cancer and provided clues to new targets for breast cancer therapy, especially therapies related to immune treatment.

Supporting information

S1 Checklist. PLOS ONE clinical studies checklist.

(DOCX)

S2 Checklist. STROBE statement—checklist of items that should be included in reports of observational studies.

(DOCX)

S1 Table. Blood-based gene expression profiles.

(XLSX)

S2 Table. Risk scores of samples.

(XLSX)

Acknowledgments

The authors would like to thank Qian Shi, who performed the Affymetrix microarray studies, and Isolde Prince, who helped with the editing of the manuscript.

Data Availability

All relevant data are within the manuscript and its supporting information. The gene expression profiles and the risk scores calculated by the predictive model based on the 10-gene panel are listed in detail in S1 and S2 Tables of the Supporting Information.

Funding Statement

Huaxia Bangfu Technology Incorporated [http://www.hxjdyl.com/en/gongsijieshao.html] sponsored this research. Changming Cheng, Yali Lyu, Min Wang, Ruirui Zhang are employees of Huaxia Bangfu Technology Inc. Choong-Chin Liew was a consultant of Huaxia Bangfu Technology Inc. The funder provided support in the form of salaries for authors [C. Cheng, Y. Lyu, M. Wang, R. Zhang], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

References

  • 1. Ferlay J, Colombet M, Soerjomataram I, Mathers C, Parkin DM, Piñeros M, et al. Estimating the global cancer incidence and mortality in 2018: GLOBOCAN sources and methods. Int J Cancer. 2019;144:1941–1953. doi: 10.1002/ijc.31937
  • 2. Yap YS, Lu YS, Tamura K, Lee JE, Ko EY, Park YH, et al. Insights into breast cancer in the East vs the West: a review. JAMA Oncol. 2019 May 16. doi: 10.1001/jamaoncol.2019.0620
  • 3. Bray F, Ferlay J, Soerjomataram I, Siegel R, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68:394–424. doi: 10.3322/caac.21492
  • 4. Sung H, Rosenberg PS, Chen WQ, Hartman M, Lim WY, Chia KS, et al. Female breast cancer incidence among Asian and Western populations: more similar than expected. J Natl Cancer Inst. 2015;107. doi: 10.1093/jnci/djv107
  • 5. Sung H, Rosenberg PS, Chen WQ, Hartman M, Lim WY, Chia KS, et al. The impact of breast cancer-specific birth cohort effects among younger and older Chinese populations. Int J Cancer. 2016;139:527–534. doi: 10.1002/ijc.30095
  • 6. Linos E, Spanos D, Rosner BA, Linos K, Hesketh T, Qu JD, et al. Effects of reproductive and demographic changes on breast cancer incidence in China: a modeling analysis. J Natl Cancer Inst. 2008;100:1352–1360. doi: 10.1093/jnci/djn305
  • 7. Youlden DR, Cramb SM, Yip CH, Baade PD. Incidence and mortality of female breast cancer in the Asia-Pacific region. Cancer Biol Med. 2014;11:101–15. doi: 10.7497/j.issn.2095-3941.2014.02.005
  • 8. Sun L, Legood R, Sadique Z, Dos-Santos-Silva I, Yang L. Cost-effectiveness of risk-based breast cancer screening programme, China. Bull World Health Organ. 2018;96:568–577. doi: 10.2471/BLT.18.207944
  • 9. Abay M, Tuke G, Zewdie E, Abraha TH, Grum T, Brhane E. Breast self-examination practice and associated factors among women aged 20–70 years attending public health institutions of Adwa town, North Ethiopia. BMC Res Notes. 2018;11:622. doi: 10.1186/s13104-018-3731-9
  • 10. Malmartel A, Tron A, Caulliez S. Accuracy of clinical breast examination's abnormalities for breast cancer screening: cross-sectional study. Eur J Obstet Gynecol Reprod Biol. 2019;237:1–6. doi: 10.1016/j.ejogrb.2019.04.003
  • 11. Bleyer A, Welch HG. Effect of three decades of screening mammography on breast-cancer incidence. N Engl J Med. 2012;367:1998–2005. doi: 10.1056/NEJMoa1206809
  • 12. Jørgensen KJ, Gøtzsche PC, Kalager M, Zahl PH. Breast cancer screening in Denmark: a cohort study of tumor size and overdiagnosis. Ann Intern Med. 2017;166:313–323. doi: 10.7326/M16-0270
  • 13. Vourtsis A, Berg WA. Breast density implications and supplemental screening. Eur Radiol. 2019;29:1762–1777. doi: 10.1007/s00330-018-5668-8
  • 14. Ogawa M. Differentiation and proliferation of hematopoietic stem cells. Blood. 1993;81:2844–53.
  • 15. Liew CC, Ma J, Tang HC, Zheng R, Dempsey AA. The peripheral blood transcriptome dynamically reflects system wide biology: a potential diagnostic tool. J Lab Clin Med. 2006;147:126–32. doi: 10.1016/j.lab.2005.10.005
  • 16. Liew CC. Method for detection of gene transcripts in blood and uses thereof. 1999. US20110003298A1
  • 17. Shi J, Cheng C, Ma J, Liew CC, Geng X. Gene expression signature for detection of gastric cancer in peripheral blood. Oncol Lett. 2018;15:9802–9810. doi: 10.3892/ol.2018.8577
  • 18. Marshall KW, Mohr S, Khettabi FE, Nossova N, Chao S, Bao W, et al. A blood-based biomarker panel for stratifying current risk for colorectal cancer. Int J Cancer. 2010;126:1177–86. doi: 10.1002/ijc.24910
  • 19. Osman I, Bajorin DF, Sun TT, Zhong H, Douglas D, Scattergood J, et al. Novel blood biomarkers of human urinary bladder cancer. Clin Cancer Res. 2006;12(11 Pt 1):3374–80. doi: 10.1158/1078-0432.CCR-05-2081
  • 20. Liong L, Lim CR, Yang H, Chao S, Bong CW, Leong WS, et al. Blood-based biomarkers of aggressive prostate cancer. PLOS ONE. 2012;7:e45802. doi: 10.1371/journal.pone.0045802
  • 21. Mok SC, Kim JH, Skates SJ, Schorge JO, Cramer DW, Lu KH, et al. Use of blood-based mRNA profiling to identify biomarkers for ovarian cancer screening. Gynecology & Obstetrics. 2017;7:6. doi: 10.4172/2161-0932.1000443
  • 22. Mercado CL. BI-RADS update. Radiol Clin North Am. 2014;52:481–7. doi: 10.1016/j.rcl.2014.02.008
  • 23. Chao S, Liew CC. Mining the dynamic genome: a method for identifying multiple disease signatures using quantitative RNA expression analysis of a single blood sample. Microarrays. 2015;4:671–689. doi: 10.3390/microarrays4040671
  • 24. Zhiquan Q. Adaboost-LLP: a boosting method for learning with label proportions. IEEE. 2017.
  • 25. Obayashi T, Kagaya Y, Aoki Y, Tadaka S, Kinoshita K. COXPRESdb v7: a gene coexpression database for 11 animal species supported by 23 coexpression platforms for technical evaluation and evolutionary inference. Nucleic Acids Res. 2019;47:D55–D62. doi: 10.1093/nar/gky1155
  • 26. Fang R, Zhu Y, Hu L, Khadka VS, Ai J, Zou H, et al. Plasma microRNA pair panels as novel biomarkers for detection of early stage breast cancer. Front Physiol. 2018;9:1879. doi: 10.3389/fphys.2018.01879
  • 27. Morrow M, Schnitt SJ, Norton L. Current management of lesions associated with an increased risk of breast cancer. Nat Rev Clin Oncol. 2015;12:227–38. doi: 10.1038/nrclinonc.2015.8
  • 28. Hartmann LC, Radisky DC, Frost MH, Santen RJ, Vierkant RA, Benetti LL, et al. Understanding the premalignant potential of atypical hyperplasia through its natural history: a longitudinal cohort study. Cancer Prev Res (Phila). 2014;7:211–7. doi: 10.1158/1940-6207.CAPR-13-0222
  • 29. Degnim AC, Visscher DW, Berman HK, Frost MH, Sellers TA, Vierkant RA, et al. Stratification of breast cancer risk in women with atypia: a Mayo cohort study. J Clin Oncol. 2007;25:2671–7. doi: 10.1200/JCO.2006.09.0217
  • 30. Boughey JC, Hartmann LC, Anderson SS, Degnim AC, Vierkant RA, Reynolds CA, et al. Evaluation of the Tyrer-Cuzick (International Breast Cancer Intervention Study) model for breast cancer risk prediction in women with atypical hyperplasia. J Clin Oncol. 2010;28:3591–6. doi: 10.1200/JCO.2010.28.0784
  • 31. Han M, Liew CT, Zhang HW, Chao S, Zheng R, Yip KT, et al. Novel blood-based, five-gene biomarker set for the detection of colorectal cancer. Clin Cancer Res. 2008;14:455–60. doi: 10.1158/1078-0432.CCR-07-1801
  • 32. Chao S, Ying J, Liew G, Marshall W, Liew CC, Burakoff R. Blood RNA biomarker panel detects both left- and right-sided colorectal neoplasms: a case-control study. J Exp Clin Cancer Res. 2013;32:44. doi: 10.1186/1756-9966-32-44
  • 33. Dennis MD, McGhee NK, Jefferson LS, Kimball SR. Regulated in DNA damage and development 1 (REDD1) promotes cell survival during serum deprivation by sustaining repression of signaling through the mechanistic target of rapamycin in complex 1 (mTORC1). Cell Signal. 2013;25:2709–16. doi: 10.1016/j.cellsig.2013.08.038
  • 34. Lecomte S, Chalmel F, Ferriere F, Percevault F, Plu N, Saligaut C, et al. Glyceollins trigger anti-proliferative effects through estradiol-dependent and independent pathways in breast cancer cells. Cell Commun Signal. 2017;15:26. doi: 10.1186/s12964-017-0182-1
  • 35. Pinto JA, Rolfo C. In silico evaluation of DNA Damage Inducible Transcript 4 gene (DDIT4) as prognostic biomarker in several malignancies. Sci Rep. 2017;7:1526. doi: 10.1038/s41598-017-01207-3
  • 36. Pinto JA, Araujo J, Cardenas NK, Morante Z, Doimi F, Vidaurre T, et al. A prognostic signature based on three-genes expression in triple-negative breast tumours with residual disease. NPJ Genom Med. 2016;1:15015. doi: 10.1038/npjgenmed.2015.15
  • 37. Salsman J, Stathakis A, Parker E, Chung D, Anthes LE, Koskowich KL, et al. PML nuclear bodies contribute to the basal expression of the mTOR inhibitor DDIT4. Sci Rep. 2017;7:45038. doi: 10.1038/srep45038
  • 38. Ozkaya AB, Ak H, Aydin HH. High concentration calcitriol induces endoplasmic reticulum stress related gene profile in breast cancer cells. Biochem Cell Biol. 2017;95:289–294. doi: 10.1139/bcb-2016-0037
  • 39. Malaspina A, Kaushik N, Belleroche J. A 14-3-3 mRNA is up-regulated in amyotrophic lateral sclerosis spinal cord. J Neurochem. 2000;75:2511–20. doi: 10.1046/j.1471-4159.2000.0752511.x
  • 40. Vazquez A, Bond EE, Levine AJ, Bond GL. The genetics of the p53 pathway, apoptosis and cancer therapy. Nat Rev Drug Discov. 2008;7:979–87. doi: 10.1038/nrd2656
  • 41. Jamshidi M, Schmidt MK, Dörk T, Garcia-Closas M, Heikkinen T, Cornelissen S, et al. Germline variation in TP53 regulatory network genes associates with breast cancer survival and treatment outcome. Int J Cancer. 2013;132:2044–55. doi: 10.1002/ijc.27884
  • 42. Schon K, Tischkowitz M. Clinical implications of germline mutations in breast cancer: TP53. Breast Cancer Res Treat. 2018;167:417–423. doi: 10.1007/s10549-017-4531-y
  • 43. Hidalgo M, Rowinsky EK. The rapamycin-sensitive signal transduction pathway as a target for cancer therapy. Oncogene. 2000;19:6680–6. doi: 10.1038/sj.onc.1204091
  • 44. Shou W, Aghdasi B, Armstrong DL, Guo Q, Bao S, Charng MJ, et al. Cardiac defects and altered ryanodine receptor function in mice lacking FKBP12. Nature. 1998;391:489–92. doi: 10.1038/35146
  • 45. Sehgal SN. Rapamune (Sirolimus, rapamycin): an overview and mechanism of action. Ther Drug Monit. 1995;17:660–5. doi: 10.1097/00007691-199512000-00019
  • 46. Dhandhukia JP, Li Z, Peddi S, Kakan S, Mehta A, Tyrpak D, et al. Berunda polypeptides: multi-headed fusion proteins promote subcutaneous administration of rapamycin to breast cancer in vivo. Theranostics. 2017;7:3856–3872. doi: 10.7150/thno.19981
  • 47. Eloy JO, Petrilli R, Brueggemeier RW, Marchetti JM, Lee RJ. Rapamycin-loaded immunoliposomes functionalized with Trastuzumab: a strategy to enhance cytotoxicity to HER2-positive breast cancer cells. Anticancer Agents Med Chem. 2017;17:48–56.
  • 48. Bhushan L, Kandpal RP. EphB6 receptor modulates micro RNA profile of breast carcinoma cells. PLOS ONE. 2011;6:e22484. doi: 10.1371/journal.pone.0022484
  • 49. Okadome T, Oeda E, Saitoh M, Ichijo H, Moses HL, Miyazono K, et al. Characterization of the interaction of FKBP12 with the transforming growth factor-beta type I receptor in vivo. J Biol Chem. 1996;271:21687–90. doi: 10.1074/jbc.271.36.21687
  • 50. Khatua S, Peterson KM, Brown KM, Lawlor C, Santi MR, LaFleur B, et al. Overexpression of the EGFR/FKBP12/HIF-2alpha pathway identified in childhood astrocytomas by angiogenesis gene profiling. Cancer Res. 2003;63:1865–70.
  • 51. Wang YQ, Qi XW, Wang F, Jiang J, Guo QN. Association between TGFBR1 polymorphisms and cancer risk: a meta-analysis of 35 case-control studies. PLOS ONE. 2012;7:e42899. doi: 10.1371/journal.pone.0042899
  • 52. Dentice M, Bandyopadhyay A, Gereben B, Callebaut I, Christoffolete MA, Kim BW, et al. The Hedgehog-inducible ubiquitin ligase subunit WSB-1 modulates thyroid hormone activation and PTHrP secretion in the developing growth plate. Nat Cell Biol. 2005;7:698–705. doi: 10.1038/ncb1272
  • 53. Cao J, Wang Y, Dong R, Lin G, Zhang N, Wang J, et al. Hypoxia-induced WSB1 promotes the metastatic potential of osteosarcoma cells. Cancer Res. 2015;75:4839–51. doi: 10.1158/0008-5472.CAN-15-0711
  • 54. Poujade FA, Mannion A, Brittain N, Theodosi A, Beeby E, Leszczynska KB, et al. WSB-1 regulates the metastatic potential of hormone receptor negative breast cancer. Br J Cancer. 2018;118:1229–1237. doi: 10.1038/s41416-018-0056-3
  • 55. Archange C, Nowak J, Garcia S, Moutardier V, Calvo EL, Dagorn JC, et al. The WSB1 gene is involved in pancreatic cancer progression. PLOS ONE. 2008;3:e2475. doi: 10.1371/journal.pone.0002475
  • 56. Chen QR, Bilke S, Wei JS, Greer BT, Steinberg SM, Westermann F, et al. Increased WSB1 copy number correlates with its over-expression which associates with increased survival in neuroblastoma. Genes Chromosomes Cancer. 2006;45:856–62. doi: 10.1002/gcc.20349

Decision Letter 0

Sumitra Deb

30 Jan 2020

PONE-D-19-31191

Peripheral blood transcriptome identifies high-risk benign and malignant breast lesions

PLOS ONE

Dear Dr Liew,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Mar 15 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Sumitra Deb, PhD

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We noticed you have some minor occurrence(s) of overlapping text with the following previous publication(s), which needs to be addressed:

https://doi.org/10.3892/ol.2018.8577

https://doi.org/10.1038/s41416-018-0056-3

In your revision ensure you cite all your sources (including your own works), and quote or rephrase any duplicated text outside the Methods section. Further consideration is dependent on these concerns being addressed.

3. Thank you for stating the following in the Financial Disclosure section:

"Funder: Huaxia Bangfu Technology Inc

https://www.hxjdyl.com/en/gongsijieshao

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

   

We note that one or more of the authors are employed by commercial companies: Huaxia Bangfu Technology Inc and Golden Health Diagnostics Incorporated.

a) Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

b) Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.  

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: “This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

Reviewer #3: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this manuscript, the authors report a method for detecting breast lesions using peripheral blood transcriptomic profiling. Overall the manuscript is well designed, but some issues need to be addressed. Minor: The font in Figure 1 should be made clearer; it was a little hard to read. Major: A larger sample size and more validation studies will definitely be needed to support this method. Did the authors try RNAseq instead of microarray? They need to address the fact that microarray does not allow accurate assessment of low signal intensities and may also give background hybridization. The authors also need to provide information on the ER and HER2 status of the lesions, if available.

Reviewer #2: The authors have described a ten-gene panel signature that can be used as a predictive model to detect high-risk breast lesions as well as breast cancer. The authors have performed the research very comprehensively and all the analyses performed are scientifically sound. While I do believe the research performed is of significant importance in the field, I request that the authors address a few concerns before the manuscript can be accepted for publication:

1) On page 9, the authors mention that patients with breast cancer were older than the controls as well as the ones with high-risk lesions. How do the authors know that the ten-gene panel signature isn't simply reflective of aging, rather than directly related to the process of carcinogenesis?

2) Did the authors check for the known drivers of cancer, and in particular of breast cancer, such as BRCA1, BRCA2, p53 and other such genes? Or was the analysis done in a way to find genes beyond the ones already identified in the literature?

3) Is there a way to follow up to see if the 3 'false-positive' normal samples in their test set go on to develop high risk breast lesions and therefore check if the ten-gene panel signature is actually predictive of the early stage carcinogenesis/development of high-risk lesions?

4) While the premise of the project and that of the manuscript is to discover a ten-gene signature panel that is predictive of developing high risk breast lesions, all the samples analyzed are grade 3 or 4 (BI-RADS). Can authors actually test if this panel is actually predictive of development of high-risk lesions by testing their gene signature panel in the grade I or grade II patients?

5) Have the authors tested to see if, among the 147 protein interacting partners, the transcripts of the interacting partners were also altered in their RNAseq analyses? This is only to understand whether multiple genes in the pathways described in the manuscript have been altered, and therefore to narrow down a potential pathway of interest for further studies.

Once these comments are addressed, the manuscript can be considered for publication.

Reviewer #3: Comments:

1. The description does not make clear which parameters were used to identify the 10 candidate genes.

2. It would be interesting to see a comparison of the gene expression profiles of high-risk benign breast lesions vs malignant breast lesions.

3. Since the sample size is small, it is important to use age-matched controls.

4. The false positivity observed in 3 of the 19 controls of the predictive model is not justified, despite the small sample size.

5. Significant figures should be considered when calculating fold change.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Jun 4;15(6):e0233713. doi: 10.1371/journal.pone.0233713.r002

Author response to Decision Letter 0


23 Feb 2020

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Response: We hope we have met PLOS ONE style requirements.

2. We noticed you have some minor occurrence(s) of overlapping text with the following previous publication(s), which needs to be addressed:

https://doi.org/10.3892/ol.2018.8577

https://doi.org/10.1038/s41416-018-0056-3

In your revision ensure you cite all your sources (including your own works), and quote or rephrase any duplicated text outside the Methods section. Further consideration is dependent on these concerns being addressed.

Response: The text that overlapped with previous publications has been rephrased on pages 6–7 and page 20.

3. Thank you for stating the following in the Financial Disclosure section:

"Funder: Huaxia Bangfu Technology Inc

https://www.hxjdyl.com/en/gongsijieshao

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

We note that one or more of the authors are employed by commercial companies: Huaxia Bangfu Technology Inc and Golden Health Diagnostics Incorporated.

a) Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

b) Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests) . If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

Response: We have added the following statements to our Cover Letter:

Funding Statement

Huaxia Bangfu Technology Incorporated [http://www.hxjdyl.com/en/gongsijieshao.html] sponsored this research. Changming Cheng, Yali Lyu, Min Wang, Ruirui Zhang are employees of Huaxia Bangfu Technology Inc. Choong-Chin Liew was a consultant of Huaxia Bangfu Technology Inc. The funder provided support in the form of salaries for authors [C. Cheng, Y. Lyu, M. Wang, R. Zhang], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

Competing Interests Statement

The authors have read the journal’s policy and have the following conflicts: Changming Cheng, Yali Lyu, Min Wang, Ruirui Zhang are employees of Huaxia Bangfu Technology Inc who sponsored this research. Choong-Chin Liew was a consultant of Huaxia Bangfu Technology Inc and the founder of Golden Health Diagnostics Incorporated. None of the other authors has any competing interests. There are no patents, products in development or marketed products to declare. This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials.

Author contributions

Hong Hou Conceptualization, Data curation, Resources

Yali Lyu Conceptualization, Project administration, Writing - original draft

Jing Jiang Data curation, Resources

Min Wang Data curation, Formal analysis, Software

Ruirui Zhang Formal analysis, Methodology, Visualization

Changming Cheng Conceptualization, Supervision, Writing - review & editing, Funding acquisition

Choong-Chin Liew Conceptualization, Writing - review & editing

Binggao Wang Conceptualization, Data curation, Resources, Supervision


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

Reviewer #3: No

________________________________________

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

________________________________________

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Data Availability Statement: All relevant data are within the manuscript and its supporting information. The gene expression profiles and the risk scores calculated by the predictive model based on the 10-gene panel are listed in detail in S1 and S2 Tables of the Supporting Information.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: Yes

________________________________________

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

________________________________________

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this manuscript authors have reported a method for detection of breast lesions using peripheral blood transcriptomic profiling. Overall the manuscript has been designed well but some issues need to be addressed. Minor: The font in Figure 1 should be made clearer. It was a little hard to read.

Response: Figure 1 was revised on page 13, and we hope that the font is clearer now.

Fig. 1. Heat map of gene expression and hierarchical cluster diagram showing 10 single candidate genes (A) and a 10-gene combination (B) for clustering the 151 samples including 107 breast lesions and 44 normal controls. Dendrogram generated using the “Heatmap” function in R with default settings.

Major: Definitely a larger sample size and more validation studies will be needed to support this method. Did authors try RNAseq instead of microarray? They need to address the fact that microarray has a limitation in which it does not allow accurate assessment of low signal intensities. And may also give background hybridization. Also the authors need to provide information on the ER, HER2 status of the lesions if available.

Response: Thanks for this constructive suggestion. Certainly RNAseq is a new and powerful tool for transcriptomic study, especially suitable for exploring unknown transcript fragments and RNA sequence variation. However, Affymetrix microarray has also been proven to be a robust and accurate technology for gene expression profiling studies. The goal of our study is to develop a technology that can exploit gene expression signatures characteristic of various breast lesion types. In previous publications, we identified a series of gene biomarkers for cancer (see references 17,31,32) that were characterized using microarray analysis and were further confirmed by RT-PCR. Thus we think Affymetrix microarray analysis is a reliable method to study gene expression signatures, as in this study.

We have added to the manuscript, on page 22:

“RNA sequencing (RNAseq) has been proven an efficient tool for transcriptome analysis, especially for exploring expression signatures of unknown transcript fragments and revealing the signaling pathways beneath. An interesting subject for future study would be to compare the variations in gene expression signatures between RNAseq and the microarray method.”

The treatment of data for Affymetrix microarray hybridization has been described in detail in our previous report (Chao S, Liew CC. (2015). "Mining the Dynamic Genome: A Method for Identifying Multiple Disease Signatures Using Quantitative RNA Expression Analysis of a Single Blood Sample." Microarrays 4: 671-689.). We have added this as a reference (#23) in Materials and Methods section on page 7.

Here for your reviewers’ information is a link to this article: [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996407/]

ER and HER2 information was incomplete and is therefore unavailable, as only some patients underwent ER/HER2 testing in this study.

Reviewer #2: The authors have described a ten-gene panel signature that can be used as a predictive model to detect high risk breast lesions as well as breast cancer. The authors have performed the research very comprehensively and all the analyses performed is scientifically sound. While I do believe the research performed is of significant importance in the field, I do request the authors to address a few concerns before the manuscript can be accepted for publication:

1) On page 9, the authors mention that patients with breast cancer were older than the controls as well as the ones with high-risk lesions. How do the authors know that the ten-gene panel signature isn't simply reflective of aging rather than directly related to the process of carcinogenesis?

Response: Certainly a larger sample size and more validation studies will be needed to support the findings described in our study. In a future report, we would like to enroll more healthy controls, more women with high risk breast lesions and more breast cancer patients with better-matched age distributions.

We have therefore added on pp. 21/22:

“…our samples show some age variation among the healthy controls, the women with high risk lesions and the breast cancer patients. Although we have shown in our discussion that the gene biomarkers we identified are related more to the process of carcinogenesis than to aging, we intend to confirm the effectiveness of our data mining method in further studies, using a larger sample size with age-matched patients.”

2) Did the authors check for the known drivers of cancer, and in particular of breast cancer, such as BRCA1, BRCA2, p53 and other such genes? Or was the analysis done in a way to find genes beyond the ones already identified in the literature?

Response: This is an interesting question for further study. The mutations of BRCA1, BRCA2 and p53 genes have been widely observed in breast cancer tissue cells and are expected to play important roles in carcinogenesis. However, in the present study, we sought to identify an altered expression signature as a characteristic of blood cells rather than of cancer tissue cells. We thus did not expect to find correlations in gene expression alterations between blood cells and breast cancer cells. By analyzing the gene expression profiles of peripheral blood cells, we found that driver genes of cancer cells such as BRCA1, BRCA2, p53 and other genes did not exhibit significant variation in blood cells as compared to healthy controls (added on page 21). However, the gene biomarkers we identified in the blood-based transcriptome are shown to be related to the process of carcinogenesis in our bioinformatic analysis, as discussed in detail in the Discussion section on pages 18-21. Further study to identify the signaling pathways of blood cells and their interaction with cancer cells is warranted to better understand the roles of blood cells in the carcinogenesis process.

We have added to our manuscript on p. 21:

“Mutations and abnormalities in the expression of BRCA1, BRCA2 and p53 genes have been widely observed in breast cancer tissue cells and are thought to play important roles as driver genes in carcinogenesis. In our study, however, BRCA1, BRCA2, p53 and other driver genes did not exhibit significant variations in expression between breast lesions and healthy controls in peripheral blood cells. This difference might be attributed to differences in biological functions between blood cells and tissue cells in the process of carcinogenesis. Further study is required in order to identify the signaling pathways of blood cells and their interaction with cancer cells to better understand the roles of blood cells in carcinogenesis.”

3) Is there a way to follow up to see if the 3 'false-positive' normal samples in their test set go on to develop high risk breast lesions and therefore check if the ten-gene panel signature is actually predictive of the early stage carcinogenesis/development of high-risk lesions?

Response: We have added on p.17:

“We are planning to follow over the next 3-5 years those healthy controls who presented with false-positive results, in order to determine whether some of them will go on to develop high-risk lesions or breast cancer.”

4) While the premise of the project and that of the manuscript is to discover a ten-gene signature panel that is predictive of developing high risk breast lesions, all the samples analyzed are grade 3 or 4 (BI-RADS). Can authors actually test if this panel is actually predictive of development of high-risk lesions by testing their gene signature panel in the grade I or grade II patients?

Response: Due to the limited sample size in the present study, the number of grade I or grade II patients is too small to generate a reliable predictive result. In a future study, we plan to recruit more patients with grade I or grade II to further validate the performance of this ten-gene panel in early detection of high-risk lesions and breast cancer.

5) Have the authors tested to see if among the 147 protein interacting partners, the transcripts of the interacting partners were also altered in their RNAseq analyses? This is only to understand if multiple genes in the pathways described in the manuscript have been altered, and therefore narrow down a potential pathway of interest for further studies?

Response: Thanks for this constructive suggestion. In the present study, we aim to identify blood-based genomic signatures that discriminate high-risk lesion and breast cancer from healthy controls using microarray analysis. As the reviewer suggests, RNAseq is a new and powerful tool for transcriptomic study, and is especially suitable for identifying unknown transcript fragments and exploring their signaling pathways. We would like to use RNAseq method to analyze and confirm potential pathways of interesting blood transcripts in a future study.

We have added to the manuscript, on page 22:

“RNA sequencing (RNAseq) has been proven an efficient tool for transcriptome analysis, especially for exploring expression signatures of unknown transcript fragments and revealing the signaling pathways beneath. An interesting subject for future study would be to compare the variations in gene expression signatures between RNAseq and the microarray method.”

Once these comments are addressed, the manuscript can be considered for publication.

Reviewer #3: Comments:

1. It is not clear from the description about the parameters used to identify the 10 candidate genes.

Response: the data mining method for blood-based genomic signatures has been described in our previous reports ([17,31,32]). To clarify this in the current manuscript, we have added a new reference to a previous publication that has described in detail the data mining process for blood-based signature identification (Samuel Chao, C. C., Choong-Chin Liew (2015). "Mining the Dynamic Genome: A Method for Identifying Multiple Disease Signatures Using Quantitative RNA Expression Analysis of a Single Blood Sample." Microarrays 4: 671-689.). This is added in Materials and Methods section on page 8.

We hope that the reviewer will find our methods to be robust and well-validated. Here for the reviewer’s information is a link to this article: [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996407/]

2. It would be interesting to know the comparison of the various gene expression profile between the high risk benign breast lesion vs malignant breast lesions

Response: This is an interesting suggestion for further study. In the present study, we expected to develop blood-based genomic signatures to discriminate high-risk lesions and breast cancer from healthy controls. In a future study, we would like to explore the evolution from high risk benign lesion to malignant breast carcinoma by comparing variations between high risk benign breast lesion and malignant breast lesions.

3. Since the sample size is small it is important to use an age matched control.

Response: we have added our response to page 22:

“…our samples show some age variation among the healthy controls, the women with high risk lesions and the breast cancer patients. Although we have shown in our discussion that the gene biomarkers we identified are related more to the process of carcinogenesis than to aging, we intend to confirm the effectiveness of our data mining method in further studies, using a larger sample size with age-matched patients.”

4. The false positivity observed in 3 of the 19 controls of the predictive model is not justified despite the small sample size

Response: We have added to page 17:

“We are planning to follow those healthy controls who presented with false-positive results over the next 3-5 years, in order to determine whether some of them will go on to develop high-risk lesions or breast cancer.”
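
The reviewer's concern about 3 false positives among 19 test-set controls can be made concrete with a short calculation. The sketch below is not part of the authors' analysis: it recomputes the specificity implied by those counts and adds a 95% Wilson score interval to show how wide the uncertainty is at this sample size. The choice of the Wilson interval and the function names are assumptions made for illustration only.

```python
from math import sqrt


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of controls correctly classified as negative."""
    return true_neg / (true_neg + false_pos)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the reviewer's comment: 3 false positives among 19 controls.
controls = 19
false_pos = 3
true_neg = controls - false_pos

spec = specificity(true_neg, false_pos)
low, high = wilson_interval(true_neg, controls)

print(f"specificity = {spec:.1%}")  # 16/19, i.e. the 84.2% reported in the abstract
print(f"95% Wilson interval: {low:.1%} to {high:.1%}")
```

With only 19 controls the interval is wide, which is consistent with the reviewers' request for a larger validation cohort.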

5. Significant figures should be considered while calculating Fold change.

Response: The fold changes of the 10 genes of interest are listed in Table 3. Variation in blood gene expression profiles between cancer samples and healthy samples is usually not as significant as variations found between cancer cells and surrounding healthy tissue cells. This issue makes blood-based biomarker screening more difficult. To overcome these challenges we have developed an effective strategy for identifying blood-based genomic signatures, as referenced above in Question 1: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4996407/

Table 3. Candidate biomarkers for distinguishing breast lesions from controls

Probe set ID | Gene Symbol | Gene Title | Fold Change | Regulation
202887_s_at | DDIT4 | DNA damage inducible transcript 4 | 2.0014469 | up
214953_s_at | APP | amyloid beta (A4) precursor protein | 1.97339 | up
214119_s_at | FKBP1A | FK506 binding protein 1A | 1.8358978 | up
202876_s_at | PBX2 | pre-B-cell leukemia homeobox 2 | 1.7287292 | up
200693_at | YWHAQ | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, theta | 1.0801506 | up
217317_s_at | HERC2P2 | hect domain and RLD 2 pseudogene 2 | -1.2864129 | down
208835_s_at | LUC7L3 | LUC7-like 3 pre-mRNA splicing factor | -1.3374902 | down
201296_s_at | WSB1 | WD repeat and SOCS box containing 1 | -1.350631 | down
201101_s_at | BCLAF1 | BCL2-associated transcription factor 1 | -1.4449192 | down
1554178_a_at | FAM126B | family with sequence similarity 126, member B | -1.4458523 | down
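
The reviewer's point about significant figures can be illustrated with the Table 3 values. The sketch below rounds each fold change to three significant figures and also converts it to a log2 ratio. It is an illustration only, not the authors' procedure. It assumes the common signed fold-change convention in which a negative value -x means the expression ratio is 1/x; the source does not state which convention was used. The helper names `round_sig` and `signed_fc_to_log2` are hypothetical.

```python
from math import floor, log10, log2


def round_sig(x: float, sig: int = 3) -> float:
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - floor(log10(abs(x))))


def signed_fc_to_log2(fc: float) -> float:
    """Convert a signed fold change to a log2 ratio.

    Assumed convention (not confirmed by the source): fc >= 1 is the
    up-regulated ratio itself, and fc <= -1 stands for a ratio of 1/|fc|.
    """
    if abs(fc) < 1:
        raise ValueError("signed fold changes are expected to satisfy |fc| >= 1")
    return log2(fc) if fc > 0 else -log2(-fc)


# Fold changes copied from Table 3.
table3 = {
    "DDIT4": 2.0014469,
    "APP": 1.97339,
    "FKBP1A": 1.8358978,
    "PBX2": 1.7287292,
    "YWHAQ": 1.0801506,
    "HERC2P2": -1.2864129,
    "LUC7L3": -1.3374902,
    "WSB1": -1.350631,
    "BCLAF1": -1.4449192,
    "FAM126B": -1.4458523,
}

for gene, fc in table3.items():
    print(f"{gene:8s} FC = {round_sig(fc):6.2f}   log2 ratio = {signed_fc_to_log2(fc):+.2f}")
```

Reporting the log2 ratio alongside the rounded fold change places up- and down-regulated genes on a symmetric scale, which makes the small effect sizes discussed in the response easier to compare.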

________________________________________

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Response to Reviewers - Feb.21.20.docx

Decision Letter 1

Sumitra Deb

17 Mar 2020

PONE-D-19-31191R1

Peripheral blood transcriptome identifies high-risk benign and malignant breast lesions

PLOS ONE

Dear Dr Liew,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by May 01 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Sumitra Deb, PhD

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: (No Response)

Reviewer #2: Partly

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: I believe while the authors have addressed all the comments, some of them have been loosely addressed. While most of the ones are fairly minor, there are three comments that still concern me which certainly affect the validity of the studies.

1) The samples in the two groups are not age matched. In response to this comment, the authors have added this to the text of the manuscript: “…our samples show some age variation among the healthy controls, the women with high risk lesions and the breast cancer patients. Although we have shown in our discussion that the gene biomarkers we identified are related more to the process of carcinogenesis than to aging, we intend to confirm the effectiveness of our data mining method in further studies, using a larger sample size with age-matched patients”. I believe this is a bit of “hand-waving” in saying that the gene biomarkers are more related to carcinogenesis than to aging. In my opinion, while they are related to carcinogenesis, they could still be related to aging and, if so, the predictive value of these genes would be fairly limited, as with a larger sample size they may also see a significant increase in false positives.

2) In their analysis, they had 3 normal samples show up as a false-positive. While the authors state that they plan to follow this patient for 3-5 years, there is no way to ensure this would happen for sure and therefore not predictive of early stage carcinogenesis and breast lesions as the authors state.

3) It is still surprising to me that known drivers of carcinogenesis such as p53, BRCA1, BRCA2 do not show up in their study. While the authors state 'This difference might be attributed to differences in biological functions between blood cells and tissue cells in the process of carcinogenesis.', I disagree with this argument. The assumption behind assessing a peripheral gene signature is that blood contains tumor cells that are now in circulation after metastasis from the primary tumor. With this rationale, some of the known drivers should have been detected.

Reviewer #3: The authors have carefully addressed the reviewers' comments. They have also added additional references to describe the methods used in the study. However, some comments can be addressed only by further study and using a bigger sample size.

I believe the authors would consider the comments suggested by the reviewers and address them in their future studies.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Jun 4;15(6):e0233713. doi: 10.1371/journal.pone.0233713.r004

Author response to Decision Letter 1


5 Apr 2020

Response to Reviewers

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

________________________________________

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: (No Response)

Reviewer #2: Partly

Reviewer #3: Yes

________________________________________

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

________________________________________

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

________________________________________

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: (No Response)

Reviewer #2: Yes

Reviewer #3: Yes

________________________________________

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: I believe that, while the authors have addressed all the comments, some have been addressed only loosely. Most of these are fairly minor, but three comments still concern me, as they affect the validity of the study.

1) The samples in the two groups are not age matched. In response to this comment, the authors have added the following to the text of the manuscript: “…our samples show some age variation among the healthy controls, the women with high risk lesions and the breast cancer patients. Although we have shown in our discussion that the gene biomarkers we identified are related more to the process of carcinogenesis than to aging, we intend to confirm the effectiveness of our data mining method in further studies, using a larger sample size with age-matched patients”. I believe it is a bit of “hand-waving” to say that the gene biomarkers are more related to carcinogenesis than to aging. In my opinion, while they are related to carcinogenesis, they could still be related to aging; if so, the predictive value of these genes would be fairly limited, as with a larger sample size the authors may also see a significant increase in false positives.

Response: We thank the reviewer for these valuable comments. Age is regarded as an important risk factor for cancer, as the incidence of most cancers increases with age. In this study, which is restricted by a limited sample size, we tried to optimize the algorithm to eliminate the interference of age as far as possible. However, as the reviewer mentioned, it is hard to confirm that the biomarkers derived are completely unrelated to age. We also state this as a limitation in the Discussion section on page 22. In further studies, we plan to validate the biomarkers and the algorithm using a larger sample size with age-matched patients.

2) In their analysis, 3 normal samples showed up as false positives. While the authors state that they plan to follow these patients for 3-5 years, there is no way to ensure that this will happen, and the signature is therefore not predictive of early stage carcinogenesis and breast lesions in the way the authors state.

Response: We would like to confirm whether those 3 false positive samples are true negative samples by following them up in the next few years. For us, the purpose of this study is to develop a blood test to predict early stage carcinoma and breast lesions. Thus, it would be interesting and valuable to our future research to confirm whether the biomarkers identified from this retrospective study are in fact effective in predicting early stage breast lesions. We expect to explore this in prospective studies of breast lesions in the future. We also state this aim in our Discussion section on Page 17.

3) It is still surprising to me that known drivers of carcinogenesis such as p53, BRCA1 and BRCA2 do not show up in their study. While the authors state 'This difference might be attributed to differences in biological functions between blood cells and tissue cells in the process of carcinogenesis.', I disagree with this argument. The assumption behind assessing a peripheral gene signature is that blood contains tumor cells that are now in circulation after metastasis from the primary tumor. With this rationale, some of the known drivers should have been detected.

Response: Gene mutations related to carcinogenesis, such as those in p53 and BRCA1/BRCA2, have been widely observed in breast tumor cells; however, they were not identified in our study. There are several possible reasons for this. Although tumor cells can be released into a patient’s peripheral blood, the proportion of such cells compared with white blood cells would be very low, even for patients with advanced disease. White blood cells predominate in the cell spectrum of peripheral blood, and therefore blood gene expression signatures would largely reflect these abundant white blood cells rather than the rare circulating tumor cells. In addition, as white blood cells and tumor cells play different biological roles in the process of carcinogenesis, their gene expression profiles also differ. Gene expression variations in white blood cells, for example, more likely reflect interactions between the immune system and the tumor than intrinsic changes within the tumor cells themselves. These differences might be an important reason why the driver genes that have been observed in tumor cells did not show abnormal signals in the gene expression profile of peripheral blood in this study.

We have added this in the Discussion section on Pages 21-22.

Reviewer #3: The authors have carefully addressed the reviewers' comments. They have also added additional references to describe the methods used in the study. However, some comments can be addressed only by further study using a bigger sample size.

I believe the authors would consider the comments suggested by the reviewers and address them in their future studies.

Response: On behalf of all authors, we greatly appreciate the reviewers’ valuable suggestions and will consider them seriously in our future studies.

________________________________________

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

Attachment

Submitted filename: Response to Reviewers_April.05.20.docx

Decision Letter 2

Sumitra Deb

12 May 2020

Peripheral blood transcriptome identifies high-risk benign and malignant breast lesions

PONE-D-19-31191R2

Dear Dr. Liew,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Sumitra Deb, PhD

Academic Editor

PLOS ONE

Acceptance letter

Sumitra Deb

26 May 2020

PONE-D-19-31191R2

Peripheral blood transcriptome identifies high-risk benign and malignant breast lesions

Dear Dr. Liew:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Sumitra Deb

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Checklist. PLOS ONE clinical studies checklist.

    (DOCX)

    S2 Checklist. STROBE statement—checklist of items that should be included in reports of observational studies.

    (DOCX)

    S1 Table. Blood-based gene expression profiles.

    (XLSX)

    S2 Table. Risk scores of samples.

    (XLSX)

    Attachment

    Submitted filename: Response to Reviewers - Feb.21.20.docx

    Attachment

    Submitted filename: Response to Reviewers_April.05.20.docx

    Data Availability Statement

    Data Availability Statement: All relevant data are within the manuscript and its supporting information. The gene expression profiles and the risk scores calculated by the predictive model based on the 10-gene panel are listed in detail in S1 and S2 Tables of the Supporting Information.


    Articles from PLoS ONE are provided here courtesy of PLOS
