PLOS ONE. 2023 Feb 6;18(2):e0281218. doi: 10.1371/journal.pone.0281218

Cohort profile: The Clinical and Multi-omic (CAMO) cohort, part of the Norwegian Women and Cancer (NOWAC) study

André Berli Delgado, Eline Sol Tylden, Marko Lukic, Line Moi, Lill-Tove Rasmussen Busund, Eiliv Lund, Karina Standahl Olsen
Editor: Alvaro Galli
PMCID: PMC9901780  PMID: 36745618

Abstract

Introduction

Breast cancer is the most common cancer worldwide and the leading cause of cancer-related deaths among women. The high incidence and mortality of breast cancer call for improved prevention, diagnostics, and treatment, including identification of new prognostic and predictive biomarkers for use in precision medicine.

Material and methods

With the aim of compiling a cohort amenable to integrative study designs, we collected detailed epidemiological and clinical data, blood samples, and tumor tissue from a subset of participants from the prospective, population-based Norwegian Women and Cancer (NOWAC) study. These study participants were diagnosed with invasive breast cancer in North Norway before 2013 according to the Cancer Registry of Norway and constitute the Clinical and Multi-omic (CAMO) cohort. Prospectively collected questionnaire data on lifestyle and reproductive factors and blood samples were extracted from the NOWAC study, clinical and histopathological data were manually curated from medical records, and archived tumor tissue collected.

Results

The lifestyle and reproductive characteristics of the study participants in the CAMO cohort (n = 388) were largely similar to those of the breast cancer patients in NOWAC (n = 10 356). The majority of the cancers in the CAMO cohort were tumor grade 2 and of the luminal A subtype. Approx. 80% were estrogen receptor positive, 13% were HER2 positive, and 12% were triple negative breast cancers. Lymph node metastases were present in 31% at diagnosis. The epidemiological dataset in the CAMO cohort is complemented by mRNA, miRNA, and metabolomics analyses in plasma, as well as miRNA profiling in tumor tissue. Additionally, histological analyses at the level of proteins and miRNAs in tumor tissue are currently ongoing.

Conclusion

The CAMO cohort provides data suitable for epidemiological, clinical, molecular, and multi-omics investigations, thereby enabling a systems epidemiology approach to translational breast cancer research.

Introduction

Breast cancer is diagnosed in more than two million individuals each year and has recently overtaken lung cancer as the most commonly diagnosed cancer worldwide [1]. It is also the leading cause of cancer-related deaths among women [2], with more than 680 000 deaths globally in 2020 [3]. The increasing incidence and high mortality of breast cancer call for improved diagnostic biomarkers for early detection, as well as prognostic and predictive biomarkers for precise treatment stratification.

One of the most important factors for reducing morbidity and mortality in breast cancer is early detection, as prognosis strongly depends on the stage of the disease at diagnosis. At present, the 5-year survival of breast cancer in high-income countries is up to 99% in cases of localized disease, but less than 30% if distant metastases are present [4]. This underlines the importance of diagnostic biomarkers for early disease detection.

To account for the heterogeneity between and within tumors, breast cancer treatment is becoming increasingly tailored. The goal of precision medicine is to improve the clinical outcome by detailed knowledge of the disease and targeted treatment, based on the individual’s genetic, biomarker, phenotypic, and lifestyle characteristics [5]. Further improvement and personalization of breast cancer treatment is necessary and requires identification of new prognostic and predictive biomarkers to support clinical decision making, as well as novel personalized treatment strategies targeting molecular tumor-specific sites.

Systems science relies on the idea that studying components interacting within a system gives a better understanding of their function and effects than studying each component in isolation [6]. Systems epidemiology enables identification of contributors to disease and their interactions by combining human genomic, transcriptomic, proteomic, and metabolomic data with measurements from observational epidemiologic studies [7]. The complexity of the carcinogenic process, the latency time, and the changing lifestyle of study participants argue for a systems epidemiology approach in cancer research [8].

The large, nationally representative Norwegian Women and Cancer (NOWAC) study is well suited to such an approach because it offers detailed epidemiological data, including data on lifestyle factors, retrieved from repeated questionnaires, as well as blood samples. Here we present our systems epidemiology cohort, the Clinical and Multi-omic (CAMO) cohort, of 388 female patients with invasive breast cancer, for whom we have detailed data retrieved from the questionnaires, blood samples, tissue biopsies, medical records, and the Cancer Registry of Norway. Our cohort provides a unique opportunity to combine several types of data from a wide range of sources, thereby unlocking the potential of a systems epidemiology and precision medicine approach to breast cancer.

Materials and methods

Study population

The CAMO cohort includes 388 female breast cancer patients from the Post-genome cohort within the NOWAC study (Fig 1). The NOWAC study is a prospective cohort study, which started in 1991 and included women aged 30–70 years at recruitment, randomly selected from the Norwegian Central Population Register, and irrespective of any previous cancer diagnosis [9]. Baseline information in NOWAC was collected during the years 1991–2007, the first follow-up was conducted in 1998–2014, and the second follow-up in 2004–2011. The NOWAC study data initially comprised detailed questionnaire information on lifestyle, reproductive factors, and medication, including use of hormone replacement therapy (HRT), from more than 172 000 women.

Fig 1. Study population.

Venn diagram showing size and overlap of study populations in the Norwegian Women and Cancer (NOWAC) study, the NOWAC Post-genome cohort and the Clinical and Multi-omic (CAMO) cohort.

During the years 2003–2006, the NOWAC study was expanded to collect plasma, buffy coat, and blood samples for whole-genome expression profiling using the PAXgene Blood RNA system (Preanalytix/Qiagen, Hilden, Germany). Participants received a blood sampling kit via mail, and blood was drawn at general practitioners’ offices. The samples were collected irrespective of any previous breast cancer diagnosis. Isolated RNA is stored in the cohort biobank, enabling future research with the use of updated technology. The resulting NOWAC Post-genome cohort includes 50 000 randomly selected NOWAC participants born in the years 1943–57 [10]. NOWAC participants with a diagnosis of breast cancer were identified through linkage to the Cancer Registry of Norway (update of 2018), using the unique 11-digit personal number assigned to every legal resident in Norway. Information on causes of death was acquired from the National Register for Causes of Death, and breast cancer deaths were defined as those with the ICD-10 code C50.

The patients included in the present project are the 388 participants in the NOWAC Post-genome cohort registered in North Norway and diagnosed with breast cancer before 2013. An overview of events such as data collection periods, year of breast cancer diagnosis for the CAMO participants, and changes in clinical practice during the study period is given in Fig 2.

Fig 2. Timeline of the study period.

Timeline of the study period showing year of breast cancer diagnosis of the study participants (dots), as well as notable changes in screening, diagnostic, and treatment regimes in Norway (vertical lines). Line A denotes the introduction of the national breast cancer (BC) screening program, line B the introduction of chemotherapy regimens AC and FEC in the Norwegian national guidelines for BC treatment, line C the introduction of HER2 analysis in BC diagnostics, line D the use of paclitaxel and docetaxel, aromatase inhibitor, and adjuvant trastuzumab in BC treatment, and line E the introduction of Ki67 analysis and change of ER cutoff to 1% in BC diagnostics. The time periods of data collection in the NOWAC study are shown as colored, horizontal bars. Abbreviations: AC = doxorubicin (also known as Adriamycin) and cyclophosphamide; ER = estrogen receptor; FEC = 5-fluorouracil, epirubicin and cyclophosphamide.

Among the 388 CAMO participants, 313 women (81%) answered the baseline questionnaire before receiving a breast cancer diagnosis (Table 1). In addition, 181 (46.8%) of the CAMO participants had already been diagnosed when they gave a blood sample as part of the NOWAC Post-genome cohort, so their samples are post-diagnostic. Conversely, 206 (53.2%) CAMO women were diagnosed with breast cancer after giving a blood sample, so their samples are pre-diagnostic.

Table 1. Number of CAMO study participants with pre- or post-diagnostic information from baseline and follow-up questionnaires.

Data collection  Total n  Years of sampling  Diagnosed before data collection (n, %)  Diagnosed after data collection (n, %)
Baseline         388      1991–2007          75 (19.3)                                313 (80.7)
1st follow-up    354      1998–2014          155 (43.8)                               199 (56.2)
2nd follow-up    228      2004–2011          164 (71.9)                               64 (28.1)
Blood sampling   388      2003–2006          181 (46.8)                               206 (53.2)

Questionnaire data

At inclusion, all NOWAC participants answered a 4- to 8-page questionnaire regarding health, lifestyle, diet, and reproductive factors. In addition to the baseline information, the majority of the women answered one or two intermediate follow-up questionnaires at 6- to 8-year intervals. Furthermore, participants in the NOWAC Post-genome cohort answered a 2-page questionnaire accompanying the blood sample collection (Table 1). Questionnaires and blood sampling kits were administered irrespective of any cancer diagnosis. The questionnaires included self-reporting of family history of breast cancer (mother, sister), lifestyle factors such as smoking status (never, former, current), education, body mass index (BMI), physical activity (low, moderate, high), and alcohol consumption (g/day), and reproductive factors such as reproductive history (number of children, age at first birth), use of oral contraceptives, and use of HRT. For the breast cancer cases in the NOWAC cohort as a whole, we used data from the Cancer Registry to obtain age at diagnosis, and we combined age at diagnosis with menopausal information from NOWAC questionnaires to calculate the variable menopausal status at diagnosis. In the case of missing information, we classified women as pre- or postmenopausal using an age cut-off of 53 years.
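The derivation of menopausal status at diagnosis can be illustrated with a short Python sketch. This is not the study's code: the field `age_at_menopause`, the function name, and the choice to treat a woman aged exactly 53 as postmenopausal are assumptions made here for illustration, since the text only states that questionnaire information was combined with age at diagnosis and that a cut-off of 53 years was used when information was missing.

```python
from typing import Optional

# Cut-off stated in the text for women with missing menopausal information.
AGE_CUTOFF_YEARS = 53


def menopausal_status_at_diagnosis(
    age_at_diagnosis: float,
    age_at_menopause: Optional[float] = None,
) -> str:
    """Return "pre" or "post" for one participant.

    age_at_menopause is a hypothetical questionnaire-derived field;
    None means the information is missing.
    """
    if age_at_menopause is not None:
        # Questionnaire information available: compare the two ages directly.
        return "post" if age_at_menopause <= age_at_diagnosis else "pre"
    # Information missing: fall back to the age cut-off.
    # Assumption: a woman aged exactly 53 is classified as postmenopausal.
    return "post" if age_at_diagnosis >= AGE_CUTOFF_YEARS else "pre"


if __name__ == "__main__":
    print(menopausal_status_at_diagnosis(60, age_at_menopause=50))
    print(menopausal_status_at_diagnosis(45))
```

Keeping the fallback in a separate branch makes it easy to flag which participants were classified by the cut-off rather than by their own questionnaire answers.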

Clinical database

Diagnostic biopsies and breast cancer surgery were performed at the University Hospital of North Norway in Tromsø or at the Nordland Hospital in Bodø. Clinical data were collected through manual review of medical records carried out between the years 2017 and 2019, and included age at diagnosis, treatment modality (endocrine treatment, chemotherapy, anti-HER2, radiation), cancer stage, cancer-related death, distant metastasis, relapse after treatment, and histopathological data. All histopathological data, including receptor status, were re-evaluated by a breast pathologist in the study.

Tumor grade was evaluated clinically as part of routine diagnostic assessment, based on gland formation, nuclear pleomorphism, and mitotic count, using the criteria modified by Elston and Ellis [11]. Immunohistochemical (IHC) analyses of estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) were done on needle biopsies, also as part of routine diagnostic evaluation. Several tumor markers were reanalyzed as part of the present study. To ensure diagnostic quality, we reanalyzed ER and PR for all breast cancers diagnosed before 2001. The cut-off value for ER positivity was ≥ 1% and for PR positivity ≥ 10%, as recommended in the Norwegian national guidelines. HER2 was analyzed as part of this study for all breast cancers diagnosed before HER2 analysis became part of routine practice between 2002 and 2003. The cut-off value for HER2 positivity was an IHC score of 3+, while scores of 0–1+ were considered negative. An IHC score of 2+ led to further assessment of HER2 status by silver in situ hybridization (SISH), where a HER2/chromosome 17 ratio ≥ 2.0 was considered positive. Since Ki67 was not part of the routine diagnostic protocol before 2011, Ki67 assessment was done as part of the present study for all luminal cancers diagnosed before 2011. Ki67 expression was measured in histological slides of tumor tissue from the primary surgery to differentiate between luminal A and luminal B tumors. Ki67 expression was evaluated in at least 500 tumor cells in the most proliferatively active parts of the tumors and reported as the percentage of positive tumor cell nuclei.
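The receptor cut-offs described above amount to a small decision rule, sketched below in Python. The sketch is illustrative only: the function names, the integer encoding of IHC scores (0, 1, 2, 3 for 0, 1+, 2+, 3+), and the choice to return `None` when an equivocal 2+ score has no SISH result are assumptions made here, not part of the study's workflow.

```python
from typing import Optional


def er_positive(er_percent: float) -> bool:
    """ER is positive when at least 1% of tumor cell nuclei stain."""
    return er_percent >= 1.0


def pr_positive(pr_percent: float) -> bool:
    """PR is positive when at least 10% of tumor cell nuclei stain."""
    return pr_percent >= 10.0


def her2_status(ihc_score: int, sish_ratio: Optional[float] = None) -> Optional[str]:
    """Classify HER2 from an IHC score and, for 2+ scores, a SISH ratio.

    ihc_score: 0, 1, 2 or 3 (hypothetical encoding of 0, 1+, 2+, 3+).
    sish_ratio: HER2/chromosome 17 ratio, only needed for a 2+ score.
    Returns "positive", "negative", or None when a 2+ score lacks SISH data.
    """
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError("ihc_score must be 0, 1, 2 or 3")
    if ihc_score == 3:
        return "positive"
    if ihc_score in (0, 1):
        return "negative"
    # Score 2+ is equivocal and is resolved by SISH.
    if sish_ratio is None:
        return None
    return "positive" if sish_ratio >= 2.0 else "negative"


if __name__ == "__main__":
    print(her2_status(3))
    print(her2_status(2, sish_ratio=1.4))
    print(er_positive(0.5), pr_positive(10.0))
```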

Molecular subtyping of tumors was done according to recommendations by the St. Gallen International Expert Consensus and previous publications [12, 13], based on the surrogate markers ER, PR, HER2 and Ki67 as follows: luminal A (ER+ and/or PR+, HER2- and Ki67 ≤ 30%), luminal B (ER+ and/or PR+, HER2- and Ki67 > 30% or ER+ and/or PR+ and HER2+), HER2 positive (ER- and PR- and HER2+) and basal-like (ER-, PR- and HER2-).
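The surrogate subtype definitions above can be expressed as a single function. The following Python sketch is a minimal illustration under stated assumptions: the function name and the boolean inputs are hypothetical, and returning `None` for a hormone receptor positive, HER2-negative tumor with no Ki67 value is a choice made here because such a tumor cannot be assigned to luminal A or B.

```python
from typing import Optional


def molecular_subtype(
    er_pos: bool,
    pr_pos: bool,
    her2_pos: bool,
    ki67_percent: Optional[float] = None,
) -> Optional[str]:
    """Assign a surrogate molecular subtype from ER, PR, HER2 and Ki67."""
    hormone_receptor_pos = er_pos or pr_pos
    if hormone_receptor_pos:
        if her2_pos:
            # ER+ and/or PR+ with HER2+ is luminal B regardless of Ki67.
            return "luminal B"
        if ki67_percent is None:
            # Ki67 is needed to separate luminal A from luminal B.
            return None
        return "luminal A" if ki67_percent <= 30.0 else "luminal B"
    # ER- and PR-: the split depends on HER2 only.
    return "HER2 positive" if her2_pos else "basal-like"


if __name__ == "__main__":
    print(molecular_subtype(True, False, False, ki67_percent=12.0))
    print(molecular_subtype(False, False, False))
```

Checking the hormone receptor status first mirrors the order of the definitions in the text, so each branch maps onto exactly one of the four listed subtypes.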

Analyses of blood samples

For all 388 women in the clinical cohort, blood samples were collected as part of the NOWAC Post-genome cohort as described previously [10]. In a subset of approximately 100 of these samples, a series of omics analyses has already been carried out. For details on the laboratory methods, please refer to S1 Appendix. Briefly, mRNA gene expression profiles were analyzed from PAXgene blood RNA samples by Illumina HumanHT-12 Expression BeadChip microarrays (Illumina, Inc. San Diego, CA, USA). The analyses were carried out by a certified Illumina service provider, the Genomics Core Facility (GCF), Norwegian University of Science and Technology, Trondheim, Norway. For plasma metabolomics, a liquid chromatography-tandem mass spectrometry (LC-MS/MS) based kit “AbsoluteIDQ p180” (Biocrates Life Sciences, Innsbruck, Austria) was used for quantification of up to 188 metabolites. The Swedish Metabolomics Centre in Umeå, Sweden carried out the analyses. Extraction and profiling of miRNA in plasma were performed by Exiqon Services (Vedbaek, Denmark). Total RNA was extracted using the miRCURY RNA isolation kit, and a PCR-based panel of 372 probes was used for miRNA profiling (miRCURY LNA Universal RT microRNA PCR Human panel I, Qiagen, Hilden, Germany).

Analyses of tumor tissue

Archived formalin-fixed paraffin-embedded (FFPE) tissue blocks were retrieved from the two pathology labs together with the corresponding hematoxylin and eosin (HE) slides.

Tissue microarrays (TMA) were constructed from all available tumor blocks from breast cancer resection specimens from the CAMO cohort participants. The histological slides were evaluated by a pathologist and representative areas of tumor tissue in the invasive front and in the center of the tumor were carefully selected and marked. The TMAs were constructed using a tissue-arraying instrument (Beecher Instruments, Silver Spring, MD) as described in previous publications [14]. In short, a 0.6 mm-diameter stylet was used to collect several replicate tissue cores from each donor block, which were then transferred to a recipient block. Sections of 4 μm were cut with a Microm microtome HM355S (Microm, Walldorf, Germany). TMAs are used for high throughput visualization of molecular targets on DNA-, miRNA-, mRNA- or protein level, using HE staining, IHC, and in situ hybridization (ISH). Assessment of a wide range of potential biomarkers using TMAs is planned or currently ongoing [15].

In a pilot study including 108 of the CAMO participants, we extracted and analyzed miRNA from FFPE tissue blocks (Exiqon, Vedbaek, Denmark). See S1 Appendix for details. Blood samples from the participants in this pilot study were included in the plasma metabolomics, miRNA analyses and mRNA expression analyses described above.

Statistical analysis

We compared characteristics of the participants in the CAMO cohort with the rest of the breast cancer cases in the NOWAC study by using independent sample t-test for continuous variables, or chi-square test for categorical variables. The results are presented either as means with standard deviations, medians with data range, or as percentages. All the analyses were done in STATA version 16.1 (Stata Corp, College Station, TX, USA) and SPSS version 26 (SPSS Inc, Chicago, IL, USA).
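As an illustration of the chi-square comparison for a categorical variable, the Python sketch below computes Pearson's chi-square statistic for the smoking status counts reported in Table 2. It is not the authors' Stata or SPSS code. Two assumptions are made: the two columns of Table 2 are treated as independent groups, and the p-value uses the closed form exp(-x/2), which holds only for 2 degrees of freedom (a 2 × 3 table).

```python
import math
from typing import List, Tuple


def chi_square_statistic(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and category.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


def p_value_two_df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-chi2 / 2.0)


# Smoking status at baseline (never, former, current), counts from Table 2.
smoking = [
    [128, 146, 112],      # CAMO cohort
    [3435, 3710, 3078],   # NOWAC breast cancer cases
]

chi2, df = chi_square_statistic(smoking)
p = p_value_two_df(chi2)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p:.2f}")
```

Under these assumptions the result is close to the p-value of 0.82 reported for smoking status in Table 2. For tables with other dimensions, a statistics package would be needed to obtain the p-value.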

Ethical considerations

The NOWAC study and the study of miRNA expression in tumor samples from the NOWAC Post-genome cohort participants have been approved by the regional ethical committee of North Norway (REKnord 2010/1931, 2013/2271, 2014/1605). The approval covers the collection of lifestyle information, blood samples, cancer tissue, storing of data, and linkage to national registries. In addition, the Norwegian Data Protection Authority (DPA) has approved the storing of all relevant, non-identifiable data, and the linkage to national registries. All the participants in NOWAC have given broad written consent to the use of the collected information and biological material for research. The participants can withdraw from the study at any time and can request that the collected samples and information be deleted. Ethical aspects have been considered within the project to ensure the most efficient and accurate use of the collected material, all in accordance with national and international guidelines and laws.

Results

For the 388 women included in the CAMO cohort, we have compiled information from baseline questionnaires, blood samples, tumor tissue, and a comprehensive clinical database (Fig 3). The median time between baseline and 1st follow-up was 5.8 years (range: 4.3–10.0 years), whereas the median time between 1st and 2nd follow-up was 8.5 years (range: 5.3–8.8 years).

Fig 3. Overview of sample sizes, data sources, and data types.

The Clinical and Multi-omic (CAMO) cohort, nested within the Norwegian Women and Cancer (NOWAC) study, provides multiple types of data from a wide range of sources, thereby enabling a systems epidemiology approach to breast cancer. These multimodal data may be combined in various ways to create complex study designs that can be used to investigate hypotheses related to breast cancer prevention, diagnostics, treatment, and survival.

The mean age at diagnosis of breast cancer in the CAMO cohort is 55.1 years compared to 57.6 years in the NOWAC study as a whole (p<0.001) (Table 2). The percentage of postmenopausal breast cancer cases in the CAMO cohort is 69.1%, which is comparable to 69.3% in the rest of the NOWAC study (p = 0.91). There were no significant differences in physical activity and smoking status between the two cohorts (p = 0.31 and p = 0.82, respectively); in both, the largest groups of participants reported being former smokers and moderately physically active at enrolment. Similarly, we found no statistically significant differences in the mean duration of education, number of children, and use of HRT. Compared to the rest of the breast cancer cases in the NOWAC cohort, the women in the CAMO cohort had a somewhat lower age at first birth (23.8 years vs. 24.4 years, p = 0.02), lower age at menarche (13.1 years vs. 13.3 years, p = 0.002), and higher BMI (24.8 kg/m2 vs. 24.3 kg/m2, p = 0.03), reported drinking less alcohol (3.3 g/day vs. 4.2 g/day, p<0.001), and were less likely to report a positive history of breast cancer in mothers (5.2% vs. 9.5%, p = 0.006) and sisters (2.1% vs. 5.3%, p = 0.01). Finally, 63.2% of all deaths that occurred during follow-up in the CAMO cohort were breast cancer related, compared to 61.8% in the NOWAC cohort; this difference was not significant (p = 0.83).

Table 2. Comparison of selected characteristics between the study populations of CAMO (n = 388) and NOWAC breast cancer cases (n = 10 356).

Characteristics at study enrolment CAMO cohort NOWAC breast cancer cases p-value
Participants at baseline, n (%) 388 10 356
Age at diagnosis (y), mean (SD) 55.1 (7.0) 57.6 (9.7) <0.001
Smoking status at baseline, n (%)
 Never 128 (33.2) 3 435 (33.6) 0.82
 Former 146 (37.8) 3 710 (36.3)
 Current 112 (29.0) 3 078 (30.1)
Duration of education (y), mean (SD) 12.3 (3.5) 12.5 (3.5) 0.23
Body mass index, mean (SD) 24.8 (4.7) 24.3 (3.8) 0.03
Physical activity level, n (%)
 Low 98 (28.1) 2 549 (26.8) 0.31
 Moderate 155 (44.4) 3 987 (41.9)
 High 96 (27.5) 2 986 (31.4)
Alcohol consumption (g/day), mean (SD) 3.3 (4.5) 4.2 (5.7) <0.001
Number of children, mean (SD) 2.2 (1.1) 2.1 (1.2) 0.12
Age at first birth (y), mean (SD) 23.8 (4.6) 24.4 (4.6) 0.02
Age at menarche (y), mean (SD) 13.1 (1.3) 13.3 (1.4) 0.002
Ever use of oral contraceptives at baseline, n (%) 229 (61.1) 5 648 (56.7) 0.09
Use of hormone replacement therapy at baseline, n (%)
 Never 223 (65.8) 3 729 (62.8) 0.063
 Former 31 (9.1) 809 (13.6)
 Current 85 (25.1) 1 405 (23.6)
Maternal history of breast cancer, n (%) 19 (5.2) 894 (9.5) 0.006
Sister history of breast cancer, n (%) 7 (2.1) 407 (5.3) 0.01
Menopausal status at diagnosis, n (%)
 Pre 120 (30.9) 3 186 (30.7) 0.91
 Post 268 (69.1) 7 201 (69.3)
Cause of death, n (%)
 Breast cancer 36 (63.2) 1 222 (61.8) 0.83
 Non-breast cancer 21 (36.8) 757 (38.3)

SD: standard deviation
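For the continuous variables in Table 2, an independent-samples t-test can be approximated from the published summary statistics alone. The Python sketch below does this for age at diagnosis. It rests on several assumptions that are not confirmed by the text: that the test used pooled (equal) variances, that no values were missing so the group sizes are 388 and 10 356, and that the normal distribution is an adequate stand-in for the t distribution at this sample size.

```python
import math
from typing import Tuple


def pooled_t_statistic(
    mean1: float, sd1: float, n1: int,
    mean2: float, sd2: float, n2: int,
) -> Tuple[float, int]:
    """Student's t statistic (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


def two_sided_p_normal_approx(t: float) -> float:
    """Two-sided p-value using the normal approximation (large df only)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Age at diagnosis, mean (SD): CAMO 55.1 (7.0), NOWAC 57.6 (9.7), from Table 2.
t, df = pooled_t_statistic(55.1, 7.0, 388, 57.6, 9.7, 10356)
p = two_sided_p_normal_approx(t)
print(f"t = {t:.2f}, df = {df}, p < 0.001: {p < 0.001}")
```

Under these assumptions the result agrees with the reported p<0.001 for age at diagnosis. Because the table reports rounded means and does not give per-variable sample sizes, the other rows cannot be expected to reproduce exactly this way.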

Histological grade 2 was the most frequent tumor grade, observed in 42.5% of the breast tumors, and the median tumor diameter was 15 mm (Table 3). The most frequent molecular subtypes were luminal A (58.0%) followed by luminal B (19.9%). More women underwent lumpectomy (67.3%) than primary mastectomy (32.2%), and one participant did not undergo any surgery. At diagnosis, most women did not have lymph node metastases (68.6%). The number of lymph node metastases at diagnosis ranged from 0 to 32, with a median of zero. Up until the data collection for the clinical database started in 2017, 10.8% of cancers had metastasized to distant sites and 8.0% had relapsed. Most tumors were hormone receptor positive: 81.3% were ER positive and 62.6% were PR positive. HER2 positivity was found in 12.9% of breast cancers. Radiation was the most common adjuvant therapy (73.7%), followed by anti-estrogen (52.8%) and chemotherapy (37.9%). Only 6.4% of the participants received anti-HER2 therapy. We observed the highest number of missing values in the following variables: molecular subgroup (5.7%), PR status (4.9%), and HER2 status (4.6%).

Table 3. Clinical characteristics of the participants in the CAMO cohort (n = 388).

Characteristics Estimate Missing, n (%)
Tumor diameter in millimeters, median (range) 15 (0.1–92) 3 (0.8)
Tumor grade, n (%) 7 (1.8)
 1 121 (31.2)
 2 165 (42.5)
 3 95 (24.5)
Molecular subgroup, n (%) 22 (5.7)
 Luminal A 225 (58.0)
 Luminal B 77 (19.9)
 HER2+ 18 (4.6)
 Basal-like 46 (11.9)
Surgery, n (%) 1 (0.3)
 None 1 (0.3)
 Lumpectomy 261 (67.3)
 Mastectomy 125 (32.2)
Lymph node metastasis, n (%) 3 (0.8)
 Yes 119 (30.7)
 No 266 (68.6)
Number of lymph node metastases, median (range) 0 (0–32) 4 (1.0)
Distant metastasis*, n (%) 3 (0.8)
 Yes 42 (10.8)
 No 343 (88.4)
Relapse, n (%) 2 (0.5)
 Yes 31 (8.0)
 No 355 (91.5)
ER positive, n (%) 3 (0.8)
 Yes 313 (81.3)
 No 72 (18.7)
PR positive, n (%) 19 (4.9)
 Yes 243 (62.6)
 No 126 (32.5)
HER2 positive, n (%) 18 (4.6)
 Yes 50 (12.9)
 No 320 (82.5)
Adjuvant treatment, n (%)
 Radiation 286 (73.7) 7 (1.8)
 Chemotherapy 147 (37.9) 9 (2.3)
 Anti-estrogen 205 (52.8) 8 (2.1)
 Anti-HER2 25 (6.4) 5 (1.3)

ER: estrogen receptor, HER2: human epidermal growth factor receptor 2, PR: progesterone receptor,

SD: standard deviation

*Distant metastasis at any time point

Discussion

Nested within the NOWAC cohort, the CAMO cohort of 388 female breast cancer patients lays the foundation for integrative study designs. By merging epidemiological, clinical, molecular, and multi-omics data retrieved from national registries, questionnaires, medical records, blood samples and tissue biopsies, the CAMO cohort enables a systems epidemiology approach to breast cancer.

The CAMO cohort includes a set of already analyzed data for several molecular markers and omics profiles, from both blood samples and tumors. Molecular markers in tumors have been analyzed for the entire cohort, whereas a core of approximately 100 samples is approaching the full systems epidemiology potential. In this core subsample, the following omics data is available, in addition to the molecular tumor markers: whole-blood mRNA gene expression profiles, miRNA profiles and metabolomics in plasma, and miRNA profiles in tumors.

To date, our published findings on these 100 samples have been based on the tumor material and have focused on single markers or profiles. We identified significantly different miRNA profiles between malignant and normal breast tissue, and between cancer subgroups according to ER status, tumor grade and molecular subtype [16]. miRNAs in the miR‑17‑92 cluster and miR‑17 family were overexpressed in high grade and triple‑negative tumors. Further, we showed that the expression of miR-143 and miR-145 was lower in malignant compared to normal breast tissue, and lower in the more aggressive tumors with higher tumor grade, loss of ER and the basal-like phenotype [17].

From the blood sampling in the NOWAC Post-genome cohort, single-omics data have been published in combination with use of data from the National Cancer Registry. In prospective analyses, we identified time-dependent changes of the blood transcriptome up to 8 years before breast cancer diagnosis [18], and found potentially wide-reaching differences in blood gene expression profiles between metastatic and non-metastatic breast cancer cases up to two years before diagnosis [19]. In blood samples taken after the breast cancer diagnosis, a transient increase in the number of differentially expressed genes was identified at 3–4 years after diagnosis, but only in patients who later died [20]. Along the same lines, we provided a proof of concept for the use of blood gene expression profiles as biomarkers of death from metastatic breast cancer [21].

We have explored associations of blood gene expression profiles and several breast cancer related lifestyle and dietary exposures. For example, we showed that each pregnancy changes blood gene expression profiles in a linear fashion relative to the number of children. However, this was only found in healthy women, and not in women who later developed breast cancer [22]. Several other lifestyle and dietary factors also impact blood gene expression, ultimately affecting the molecular and immunological processes potentially involved in breast cancer. These include smoking [23], HRT [24], dietary fatty acids [25], vitamin D [26] and coffee consumption [27].

Collectively, the studies published to date based on either the CAMO or the NOWAC Post-genome cohort demonstrate the potential of the available data to provide new insight into cancer development and progression, and its potential to contribute to precision medicine. We have demonstrated the importance of incorporating questionnaire data with molecular markers and profiles, and the importance of the clinical data for patient stratification. Also, the time factor is essential when studying the trajectories of molecular profiles before, at, and after diagnosis. In CAMO, serial sampling of biological specimens was not carried out. To account for this, we have approached the time issue using group level data, exploiting the randomness of the length of follow-up time due to the study design of the NOWAC Post-genome cohort.

Future use of the data from the CAMO cohort will move into multi-level analyses, by incorporating data from both blood and tumor with data from questionnaires, whole slide images, lymph node analyses, multiple omics profiles, and molecular markers, all measured at different time points in the trajectory going from the healthy to the disease state. These data will be combined with information from the clinical database and end point registries (Fig 3). The multiple dimensions that can be combined for each possible study design have been described elsewhere [28]. Briefly, the combination of the following dimensions will create a multitude of possible study designs: time, exposures, measurements, diagnosis, participant selection, sample types, as well as stratification and de-confounding. The process of designing each study will be stepwise, by defining the content of each dimension and making choices relevant to the study question at hand. Several challenges accompany each dimension of this systems epidemiology and precision medicine approach. Challenges include the risk of systematic errors, the adequate collection and preservation of biological samples, building computer system architectures for data management and analysis, as well as the need for advanced analytical approaches for dimension reduction, blockwise missingness of data, data pattern exploration, and causal inference.

The main strengths of the CAMO cohort include its prospective design, the relatively large sample size compared to other similar cohorts, the large number of variables available from the clinical database, and the NOWAC questionnaires. The follow-up questionnaires make it possible to explore associations between pre- and post-diagnostic factors and clinical outcomes, to analyze effects of changes in exposures, and to attenuate the risk of measurement error. The NOWAC study was externally validated by Lund et al. in 2003, revealing no major source of selection bias [29]. The results presented here indicate that there are no substantial differences between the CAMO cohort and the rest of the breast cancer cases in the NOWAC study, other than alcohol consumption and family history of breast cancer. Although we found statistically significant differences in other characteristics such as age at diagnosis, BMI, age at first birth, and age at menarche, these differences are clinically negligible. This, along with the fact that all breast cancer cases from the NOWAC Post-genome cohort living in North Norway were the basis of inclusion in the CAMO cohort, ensures high external validity.

Due to the size of the CAMO cohort, we might lack statistical power to assess associations of less frequent biomarkers, exposures, and outcomes. A certain degree of misclassification is present due to the self-reporting of questionnaire data. For some of the women in our cohort, a large gap between the time of questionnaire data collection and the point of diagnosis may also contribute to misclassification. Finally, data collection from clinical and medical databases and records poses several issues. Despite best efforts, this type of data extraction is non-automatic, and prone to technical and human error due to the handling of very large data sets. Other data sources for verification and validation of the extracted data are not available.

Conclusion

The availability of biological samples from study participants is a prerequisite in systems epidemiological studies. Experiments using tissue from study participants enable investigation of molecular marker profiles resulting from complex real-life situations, potentially revealing how various exposures may affect disease characteristics, development, and progression. The CAMO cohort, nested within the NOWAC study, provides multi-dimensional data from a wide range of sources. These data may be combined in various ways to create a multitude of complex study designs, allowing investigation of a great variety of hypotheses related to breast cancer prevention, diagnostics, treatment, and survival. Using a systems epidemiology approach, we may increase our understanding of biological pathways in breast cancer as time- and exposure-dependent trajectories and investigate how these trajectories differ by clinical strata. This, in turn, lays the groundwork for discovery of new diagnostic, prognostic and predictive biomarkers that may facilitate early diagnosis, support clinical decision-making, and improve the precision of interventions.

Supporting information

S1 Appendix. Analyses of blood samples and tumor tissue.

(DOCX)

Acknowledgments

We are grateful to the NOWAC participants for contributing their information, blood, and tissue samples. We also acknowledge the invaluable help from technical and administrative staff in NOWAC, and from Kari Wagelid Grønn who contributed with graphic design of the figures.

Disclaimer: Some of the data in this work are from The Cancer Registry of Norway. The Cancer Registry of Norway is not responsible for the analysis or interpretation of the data.

Abbreviations

BMI

Body mass index

CAMO

Clinical and Multi-omic

DPA

Data Protection Authority

ER

Estrogen receptor

FFPE

Formalin-fixed paraffin-embedded

GCF

Genomics Core Facility

HE

Hematoxylin and eosin

HER2

Human epidermal growth factor receptor 2

HRT

Hormone replacement therapy

IHC

Immunohistochemistry

LC

Liquid chromatography

MS

Mass spectrometry

NOWAC

Norwegian Women and Cancer

OC

Oral contraceptives

PR

Progesterone receptor

SISH

Silver in situ hybridization

TMA

Tissue microarray

Data Availability

The questionnaire and registry data are stored and managed by the NOWAC research group at the Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway. Blood samples collected as part of the NOWAC Post-genome cohort (plasma, buffy coat, and PAXgene Blood RNA samples) are kept at the UiT Core Facility for Biobanking. The clinical sample material and the clinical database are kept and managed by the Department of Clinical Pathology, University Hospital of North Norway, Tromsø, Norway. Because the data collected in the CAMO cohort (from NOWAC, from medical records, and from the linkage to the Cancer Registry) are detailed and sensitive, they cannot be shared publicly or placed in a public repository. However, data will be available upon request to NOWAC at nowac@uit.no.

Funding Statement

This study was funded by UiT The Arctic University of Norway, and the NOWAC study was supported by a grant from the European Research Council (ERC-AdG 232997 TICE). The publication charges for this article have been funded by a grant from the publication fund of UiT The Arctic University of Norway. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.WHO. Breast cancer now most common form of cancer: WHO taking action www.who.int: WHO; 2021 [updated March 2 2021; cited 2021 12. july]. https://www.who.int/news/item/03-02-2021-breast-cancer-now-most-common-form-of-cancer-who-taking-action.
  • 2.GCO. Estimated number of deaths in 2020, worldwide, female, all ages: IARC; 2020 [cited 2021 12. july]. https://gco.iarc.fr/today/online-analysis-pie?v=2020&mode=cancer&mode_population=continents&population=900&populations=900&key=total&sex=2&cancer=39&type=1&statistic=5&prevalence=0&population_group=0&ages_group%5B%5D=0&ages_group%5B%5D=17&nb_items=7&group_cancer=1&include_nmsc=1&include_nmsc_other=1&half_pie=0&donut=0.
  • 3.GCO. Breast gco.iarc.fr: IARC; 2020 [cited 2021 12. july]. https://gco.iarc.fr/today/data/factsheets/cancers/20-Breast-fact-sheet.pdf.
  • 4.ACS. Survival Rates for Breast Cancer Cancer.org: The American Cancer Society; updated January 27, 2021. https://www.cancer.org/cancer/breast-cancer/understanding-a-breast-cancer-diagnosis/breast-cancer-survival-rates.html#references.
  • 5.Jameson JL, Longo DL. Precision medicine—personalized, problematic, and promising. N Engl J Med. 2015;372(23):2229–34. doi: 10.1056/NEJMsb1503104
  • 6.Laszlo A, Krippner S. Chapter 3. Systems Theories: Their Origins, Foundations, and development. Advances in Psychology. 1998;126.
  • 7.Dammann O, Gray P, Gressens P, Wolkenhauer O, Leviton A. Systems Epidemiology: What’s in a Name? Online J Public Health Inform. 2014;6(3):e198–e. doi: 10.5210/ojphi.v6i3.5571
  • 8.Lund E, Dumeaux V. Systems Epidemiology in Cancer. Cancer Epidemiology Biomarkers & Prevention. 2008;17(11):2954. doi: 10.1158/1055-9965.EPI-08-0519
  • 9.Lund E, Dumeaux V, Braaten T, Hjartåker A, Engeset D, Skeie G, et al. Cohort Profile: The Norwegian Women and Cancer Study—NOWAC—Kvinner og kreft. International Journal of Epidemiology. 2008;37(1):36–41. doi: 10.1093/ije/dym137
  • 10.Dumeaux V, Børresen-Dale A-L, Frantzen J-O, Kumle M, Kristensen VN, Lund E. Gene expression analyses in breast cancer epidemiology: the Norwegian Women and Cancer postgenome cohort study. Breast Cancer Research. 2008;10(1):R13. doi: 10.1186/bcr1859
  • 11.Elston CW, Ellis IO. Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up. Histopathology. 1991;19(5):403–10. doi: 10.1111/j.1365-2559.1991.tb00229.x
  • 12.Coates AS, Winer EP, Goldhirsch A, Gelber RD, Gnant M, Piccart-Gebhart M, et al. Tailoring therapies—improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015. Ann Oncol. 2015;26(8):1533–46. doi: 10.1093/annonc/mdv221
  • 13.Vasconcelos I, Hussainzada A, Berger S, Fietze E, Linke J, Siedentopf F, et al. The St. Gallen surrogate classification for breast cancer subtypes successfully predicts tumor presenting features, nodal involvement, recurrence patterns and disease free survival. Breast. 2016;29:181–5. doi: 10.1016/j.breast.2016.07.016
  • 14.Bremnes RM, Veve R, Gabrielson E, Hirsch FR, Baron A, Bemis L, et al. High-throughput tissue microarray analysis used to evaluate biology and prognostic significance of the E-cadherin pathway in non-small-cell lung cancer. J Clin Oncol. 2002;20(10):2417–28. doi: 10.1200/JCO.2002.08.159
  • 15.Tellez-Gabriel M, Tekpli X, Reine TM, Hegge B, Nielsen SR, Chen M, et al. Serglycin is Involved in TGF-β Induced Epithelial-Mesenchymal Transition and Is Highly Expressed by Immune Cells in Breast Cancer Tissue. Front Oncol. 2022;12.
  • 16.Moi L, Braaten T, Al-Shibli K, Lund E, Busund L-T. Differential expression of the miR-17-92 cluster and miR-17 family in breast cancer according to tumor type; results from the Norwegian Women and Cancer (NOWAC) study. Journal of Translational Medicine. 2019;17. doi: 10.1186/s12967-019-2086-x
  • 17.Johannessen C, Moi L, Kiselev Y, Pedersen MI, Dalen SM, Braaten T, et al. Expression and function of the miR-143/145 cluster in vitro and in vivo in human breast cancer. PLoS One. 2017;12(10):e0186658. doi: 10.1371/journal.pone.0186658
  • 18.Holden M, Holden L, Olsen K, Lund E. Local in Time Statistics for detecting weak gene expression signals in blood–illustrated for prediction of metastases in breast cancer in the NOWAC Post-genome Cohort. Advances in Genomics and Genetics. 2017;7:11–28.
  • 19.Holsbø E, Olsen KS. Metastatic Breast Cancer and Pre-Diagnostic Blood Gene Expression Profiles-The Norwegian Women and Cancer (NOWAC) Post-Genome Cohort. Front Oncol. 2020;10:575461-. doi: 10.3389/fonc.2020.575461
  • 20.Olsen KS, Holden M, Thalabard JC, Rasmussen Busund LT, Lund E, Holden L. Global blood gene expression profiles following a breast cancer diagnosis-Clinical follow-up in the NOWAC post-genome cohort. PLoS One. 2021;16(3):e0246650. doi: 10.1371/journal.pone.0246650
  • 21.Lund E, Holden M, Thalabard J-C, Busund L-T, Snapkov I, Holden L. 9. Signals of Death—Post-Diagnostic Single Gene Expression Trajectories in Breast Cancer—A Proof of Concept: Exploring Trajectories of Gene Expression. 2020.
  • 22.Lund E, Nakamura A, Snapkov I, Thalabard JC, Olsen KS, Holden L, et al. Each pregnancy linearly changes immune gene expression in the blood of healthy women compared with breast cancer patients. Clin Epidemiol. 2018;10:931–40. doi: 10.2147/CLEP.S163208
  • 23.Baiju N, Sandanger TM, Sætrom P, Nøst TH. Gene expression in blood reflects smoking exposure among cancer-free women in the Norwegian Women and Cancer (NOWAC) postgenome cohort. Sci Rep. 2021;11(1):680. doi: 10.1038/s41598-020-80158-8
  • 24.Waaseth M, Olsen KS, Rylander C, Lund E, Dumeaux V. Sex hormones and gene expression signatures in peripheral blood from postmenopausal women—the NOWAC postgenome study. BMC Med Genomics. 2011;4:29. doi: 10.1186/1755-8794-4-29
  • 25.Olsen KS, Fenton C, Frøyland L, Waaseth M, Paulssen RH, Lund E. Plasma fatty acid ratios affect blood gene expression profiles—a cross-sectional study of the Norwegian Women and Cancer Post-Genome Cohort. PLoS One. 2013;8(6):e67270. doi: 10.1371/journal.pone.0067270
  • 26.Standahl Olsen K, Rylander C, Brustad M, Aksnes L, Lund E. Plasma 25 hydroxyvitamin D level and blood gene expression profiles: A cross-sectional study of the Norwegian Women and Cancer Post-genome Cohort. European journal of clinical nutrition. 2013;67. doi: 10.1038/ejcn.2013.53
  • 27.R BB, T HN, Ulven SM, Skeie G, K SO. Coffee Consumption and Whole-Blood Gene Expression in the Norwegian Women and Cancer Post-Genome Cohort. Nutrients. 2018;10(8).
  • 28.Arnes J, Bongo L. 2. The Beauty of Complex Designs: Exploring Trajectories of Gene Expression. 2020.
  • 29.Lund E, Kumle M, Braaten T, Hjartåker A, Bakken K, Eggen E, et al. External validity in a population-based national prospective study—the Norwegian Women and Cancer Study (NOWAC). Cancer Causes Control. 2003;14(10):1001–8. doi: 10.1023/b:caco.0000007982.18311.2e

Decision Letter 0

Alvaro Galli

22 Nov 2022

PONE-D-22-25688

Cohort profile: The Clinical and Multi-omic (CAMO) cohort, part of the Norwegian Women and Cancer (NOWAC) study

PLOS ONE

Dear Dr. Berli Delgado,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jan 06 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Alvaro Galli

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following in the Acknowledgments Section of your manuscript: 

"We are grateful to the NOWAC participants for contributing their information, blood, and tissue samples. We also acknowledge the invaluable help from technical and administrative staff in NOWAC, and from Kari Wagelid Grønn who contributed with graphic design of the figures."

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. 

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: 

"This study was funded by UiT The Arctic University of Norway, and the NOWAC study was supported by a grant from the European Research Council (ERC-AdG 232997 TICE). The publication charges for this article have been funded by a grant from the publication fund of UiT The Arctic University of Norway. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

3. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors describe a resource they have developed nested within the Norwegian Women and Cancer (NOWAC) study. The subset described is of 388 breast cancer patients on whom epidemiologic, genomic and breast cancer pathology and treatment data have been gathered. This will be a rich resource for performing "multi-level analyses" at different time points in healthy to disease states to increase understanding of breast cancer development and progression.

Some minor comments:

Why was >10% used as threshold for PR positivity?

Would the data be altered if the current HER2 ratio of 2.0 be used for HER2 amplification?

Is it correct that the dates of the first follow up are 1998-2014? (vs. 1998-2004)?

Given the relatively small size of the cohort (in particular of HER2 enriched and basal-like subtypes), it may be that "support of clinical decision-making" and "precision interventions" is rather more aspirational, and that the study set will be better utilized for hypothesis generation for these types of management goals in women with luminal A-like breast cancers.

Reviewer #2: Delgado and co-workers present a kind of review article that describes some clinical and molecular features of the CAMO patient cohort, and some results the authors had obtained using that cohort. CAMO is part of the Norwegian Women and Cancer study (NOWAC). However, while the former covers just 388 patients, NOWAC has over 10,000 participants.

In the manuscript, some clinical features of the two cohorts are compared and most of them are not substantially different. The authors list a number of mostly own publications that have been published using the CAMO data. Unfortunately, no information is provided how others might make use of that resource (Data availability is annotated as: No – some restrictions will apply). In lines 235-237 the authors write that ‘These multimodal data may be combined in various ways to create complex study designs that can be used to investigate hypotheses related to breast cancer prevention, diagnostics, treatment, and survival’. While this statement may hold, readers would not be able to realize these promises unless they had access to the data. The utility of the CAMO data and of the manuscript are thus limited, particularly along the envisioned lines of epidemiological, clinical, molecular, and multi-omics investigations. Its use towards systems epidemiology in translational breast cancer research seems restricted to the authors and, potentially, their direct collaborators. At least 12 of the 28 references are self-citations (mostly author Lund).

The number of 388 cases in the CAMO appears to be low, given the complexity of breast cancer (subtypes) and of other parameters that are described in the manuscript. Having recruited patients into the study over a period of many years suggests that treatment regimen – intentionally affecting patient outcome – have changed and improved outcome (compare Figure 2). Data from TNBC patients having been treated, for instance with paclitaxel or platinum drugs should be hard to combine with that of patients having received other regimen in earlier years. These issues might impact significance of potential findings. It remains unclear from the manuscript why only a small subset of the NOWAC cohort was selected for CAMO, thereby not leveraging the full potential NOWAC likely has.

Other comments.

Some of the technologies used to collect data are not up-to-date. For example, gene expression profiling data had been collected using Illumina array technology that was discontinued some years ago. It is unclear if that data could be combined with state-of-the-art sequencing data.

In lines 202 and following, the generation of tissue microarrays is described. However, it is not clear how these TMAs have been used and no new findings are described. Instead, the authors write that an assessment of a wide range of potential biomarkers using TMAs is planned or currently ongoing (lines 215-216). Description of these TMAs would be most useful if data was presented having been collected using these tools.

The authors performed some statistical analysis of parameters in NOWAC vs. CAMO studies. For example, they found that the mean age was significantly different. While that in CAMO was 55.1 years, it was 57.6 in NOWAC (p<0.001). Significant differences were found also for e.g., the age at first birth (23.8 vs. 24.4 years, p=0.02), age at menarche, and reported drinking of alcohol. While these differences may be significant, are they also relevant? The numbers are reported in the text and repeated in the Table 2.

In lines 288 and following, some molecular analysis is described, however, no data is presented on molecular markers that would have been identified in a core of approximately 100 samples that is stated to approach the full systems epidemiological potential. The description of novel data and its relevance in breast cancer is absent.

This is a study of limited interest.

Reviewer #3: The manuscript “Cohort profile: The Clinical and Multi-omic (CAMO) cohort, part of the Norwegian Women and Cancer (NOWAC) study” is basically a description of study design, data collection, and representativeness of the breast cancer cases of CAMO compared to the larger NOWAC study. The data collection is surely very unique.

My suggestions to improve readability and clarity of the manuscript, and to focus on what is really the aim of this manuscript are the following (in order of the manuscript):

In the manuscript the CAMO cohort includes women with breast cancer, please clarify if this is only invasive or also DCIS.

Were all women included before or after a first primary breast cancer? How was dealt with women that had another cancer before their breast cancer or were there none?

The figures are quite informative. However, Figure 2 was not immediately clear to me. It would help to add the information related to A-E directly in text boxes with the figure.

Is each dot 5 women? Then I think I only count 380 women?

It is also not visible in the figure which proportion of women had second follow-up is it possible to somehow visualize this? I would otherwise refer to table 1 (see next comment).

In table 1 add the median (range) time between baseline and first and second follow-up. This information is especially needed given the claims in the discussion about the value of having a cohort with different times between cancer-free and after cancer diagnosis.

Page 6 line 129/130. It is a pity that less than half of the women provided a blood sample before breast cancer diagnosis, more incident breast cancers would have been valuable. How does this proportion of incident and prevalent breast cancer (in relation to the blood sampling) in CAMO relates to NOWAC?

Page 7 line 145 “self-reporting of family history of breast cancer (mother, sister),”

Please clarify. Was indeed only family history of mothers and sister reported, not of all first-degree relatives?

Page 8 “To ensure diagnostic quality, we reanalyzed ER and PR for all breast cancers diagnosed before 2001”

Which proportion of the data does this concern? There will also still be substantial variation between hospitals/ laboratories, so why did the authors not re-analyze receptors for all samples? For those that were not stained again, were these re-scored by one pathologist?

Page 14 discussion: “Molecular markers in tumors have been analyzed for the entire cohort, whereas a core of approximately 100 samples is approaching the full systems epidemiology potential.”

Where can this number be found back in the text or tables? Figure 3 and tables 1-3 suggest that all data is available for almost all breast cancer cases. Does this only refer to the part on which there is also omics data? I strongly suggest including a table with omics information in the results section (not necessarily results of analyses, but at least showing which data is available for how many samples). It would also be informative to show how data availability is distributed over core breast cancer subgroups e.g. pre- versus post-menopausal and subtypes (at least ER positive versus negative).

The discussion is overall quite long, would consider reducing the parts on own previous results and on multiple dimensions. Related to previous results, a burning question would be how the CAMO dataset can help with the next steps, this is not really addressed.

One of the most important pieces of information, which is basically table 2 only appears late in the discussion (p17 row 350-356), suggest moving this earlier in the discussion.

The data collection is very unique, but still the sample set is small for multilevel analyses. Perhaps the discussion should be tuned down a bit on the promises for the future.

“In CAMO, serial sampling of biological specimens was not carried out. To account for this, we have approached the time issue using group level data, exploiting the randomness of the length of follow-up time due to the study design of the NOWAC Post genome cohort.”

Indeed, the lack of serial sampling may be mentioned as a shortcoming. However, the current manuscript does not provide any support or evidence in the shown tables or figures that this shortcoming is covered by the study design and sampling over a longer period. Frankly I would debate that this is possible unless the study would be very large and the distribution of patient and tumor characteristics well balanced over time between samples and diagnosis of cancer.

“The NOWAC study was previously externally validated (28).” Please explain better, not clear what is meant here.

Appendix. Correct typo page 1 “as as”

The appendix explains the scoring of the TMAs but not of the original and newly stained whole slides for ER, PR, HER2.

This sentence in the Appendix “Each researcher gives three scores to each core, reflecting staining density and/or intensity in stroma, tumor cytoplasm and tumor nucleus.” needs some rewriting. I guess not all markers stain in all areas (stroma, cytoplasm, nucleus) and is always either density (is this % of cells?) or intensity (is this vague to strong coloring?) scored?

A better explanation of how the data can be requested, even if restrictions apply, should be included in the data availability statement.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Stefan Wiemann

Reviewer #3: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Feb 6;18(2):e0281218. doi: 10.1371/journal.pone.0281218.r002

Author response to Decision Letter 0


9 Jan 2023

Issues raised by the Editor

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.

Response: Thank you for pointing this out, files have been updated according to the style requirements.

2. Thank you for stating the following in the Acknowledgments Section of your manuscript:

"We are grateful to the NOWAC participants for contributing their information, blood, and tissue samples. We also acknowledge the invaluable help from technical and administrative staff in NOWAC, and from Kari Wagelid Grønn who contributed with graphic design of the figures."

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

"This study was funded by UiT The Arctic University of Norway, and the NOWAC study was supported by a grant from the European Research Council (ERC-AdG 232997 TICE). The publication charges for this article have been funded by a grant from the publication fund of UiT The Arctic University of Norway. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

Response: We have reviewed the manuscript, acknowledgements and funding statement, and we cannot identify any mistakes. Neither the Acknowledgments nor any other parts of the manuscript include any statement regarding funding.

3. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

Response: Thank you for pointing out this lack of precision. We have now added the required information to the Ethics statement in the online submission. We have also added the ethics statement to the manuscript under the “Material and methods” section, lines 231-243.

Reviewer 1

The authors describe a resource they have developed nested within the Norwegian Women and Cancer (NOWAC) study. The subset described is of 388 breast cancer patients on whom epidemiologic, genomic and breast cancer pathology and treatment data have been gathered. This will be a rich resource for performing "multi-level analyses" at different time points in healthy to disease states to increase understanding of breast cancer development and progression. Some minor comments:

Response: We highly appreciate the positive comment from Reviewer 1.

“Why was >10% used as threshold for PR positivity?”

Response: Progesterone receptor analyses have been performed according to national Norwegian guidelines. As recommended in these guidelines, progesterone receptors are analyzed by immunohistochemistry in FFPE tissue biopsies at the time of diagnosis, and ≥ 10% is used as the threshold for positivity. We have included the phrase “..as recommended in the Norwegian national guidelines” on p. 8, line 171, to clarify this issue in the text.

“Would the data be altered if the current HER2 ratio of 2.0 be used for HER2 amplification?”

Response: We highly appreciate the reviewer’s attention to important details regarding classification of the tumors. The results of all HER2 in situ hybridization (SISH) analyses have been checked, and the classification would not be altered by the change in the HER2/chromosome 17-ratio cut-off from 2.2 to 2.0. We have updated the manuscript, p. 8, lines 175-176, to «…HER2/chromosome 17-ratio ≥ 2.0», which is also in line with the current national guidelines regarding HER2 evaluation in breast cancer.

“Is it correct that the dates of the first follow up are 1998-2014? (vs. 1998-2004)?”

Response: Yes, the first follow-up was carried out in waves during the years 1998-2014. For example, some women enrolled in 1991 were invited to a first follow-up in 1998, whereas some women enrolled in 2004 were invited for their follow-up in 2014.

“Given the relatively small size of the cohort (in particular of HER2 enriched and basal-like subtypes), it may be that "support of clinical decision-making" and "precision interventions" is rather more aspirational, and that the study set will be better utilized for hypothesis generation for these types of management goals in women with luminal A-like breast cancers.”

Response: We appreciate and agree with the reviewer’s comment. However, further studies using the CAMO cohort can generate hypotheses that can be tested in larger cohorts, which could support clinical decision-making, also in smaller cancer subgroups.

Reviewer 2

Delgado and co-workers present a kind of review article that describes some clinical and molecular features of the CAMO patient cohort, and some results the authors had obtained using that cohort. CAMO is part of the Norwegian Women and Cancer study (NOWAC). However, while the former covers just 388 patients, NOWAC has over 10,000 participants.

In the manuscript, some clinical features of the two cohorts are compared and most of them are not substantially different. The authors list a number of mostly own publications that have been published using the CAMO data. Unfortunately, no information is provided how others might make use of that resource (Data availability is annotated as: No – some restrictions will apply). In lines 235-237 the authors write that ‘These multimodal data may be combined in various ways to create complex study designs that can be used to investigate hypotheses related to breast cancer prevention, diagnostics, treatment, and survival’. While this statement may hold, readers would not be able to realize these promises unless they had access to the data. The utility of the CAMO data and of the manuscript are thus limited, particularly along the envisioned lines of epidemiological, clinical, molecular, and multi-omics investigations. Its use towards systems epidemiology in translational breast cancer research seems restricted to the authors and, potentially, their direct collaborators. At least 12 of the 28 references are self-citations (mostly author Lund).

Response: We respectfully acknowledge the viewpoint of the reviewer; however, we would like to take the opportunity to highlight the hallmarks of a cohort profile-type paper. Cohort profile papers are designed to fill the space between a study protocol and a research paper. The main motivation of a cohort profile paper is to describe the rationale for assembling the cohort, what methods have been used, the baseline data describing the cohort, and its future plans. We aimed for a very clear reference to such papers, as “cohort profile” is a part of our title. We would argue that such papers are within the scope of PLOS ONE, which we have selected as our primary journal choice due to its broad readership from both the cancer research and the computational/statistical modeling fields.

The number of 388 cases in the CAMO appears to be low, given the complexity of breast cancer (subtypes) and of other parameters that are described in the manuscript. Having recruited patients into the study over a period of many years suggests that treatment regimen – intentionally affecting patient outcome – have changed and improved outcome (compare Figure 2). Data from TNBC patients having been treated, for instance with paclitaxel or platinum drugs should be hard to combine with that of patients having received other regimen in earlier years. These issues might impact significance of potential findings. It remains unclear from the manuscript why only a small subset of the NOWAC cohort was selected for CAMO, thereby not leveraging the full potential NOWAC likely has.

Response: We appreciate the reviewer pointing out this limitation. In the CAMO cohort we have registered data on treatment type, including surgery, radiation therapy, endocrine therapy and chemotherapy. We have registered both which types of chemotherapy the patients have received and the duration of the treatment.

The subset of 388 patients was chosen because these patients were diagnosed in the same health region in northern Norway. The tumor material was stored in two collaborating departments of clinical pathology and had not been used in any other research projects; it was therefore available and could be included in the CAMO cohort. Both hospitals where the tumor material was stored also use the same system for medical records, which enabled us to carry out a thorough review of the records.

Other comments. Some of the technologies used to collect data are not up-to-date. For example, gene expression profiling data had been collected using Illumina array technology that was discontinued some years ago. It is unclear if that data could be combined with state-of-the-art sequencing data.

Response: It is certainly true that the bead array technology has been surpassed by sequencing technology for the majority of research purposes. Importantly, we do have isolated RNA in the cohort biobank, so that future research can make use of updated technology. We have added a sentence to highlight this in lines 104-105.

In lines 202 and following, the generation of tissue microarrays is described. However, it is not clear how these TMAs have been used and no new findings are described. Instead, the authors write that an assessment of a wide range of potential biomarkers using TMAs is planned or currently ongoing (lines 215-216). Description of these TMAs would be most useful if data was presented having been collected using these tools.

Response: We appreciate the importance of presenting data regarding studies using the TMAs, as highlighted by the reviewer. As stated in the manuscript, this work is ongoing, and it is our aim that this manuscript presenting the cohort of breast cancers included in the TMAs would offer a detailed and accurate description of the cohort to future readers and potential collaborators. One article has been published using TMAs from the CAMO-cohort as part of a study on serglycin in breast cancer and is now referred to in the manuscript on p. 10, line 219 as an example of how the TMAs can be used.

The authors performed some statistical analysis of parameters in NOWAC vs. CAMO studies. For example, they found that the mean age was significantly different. While that in CAMO was 55.1 years, it was 57.6 in NOWAC (p<0.001). Significant differences were found also for e.g., the age at first birth (23.8 vs. 24.4 years, p=0.02), age at menarche, and reported drinking of alcohol. While these differences may be significant, are they also relevant? The numbers are reported in the text and repeated in the Table 2.

Response: We agree with the reviewer that these differences are not necessarily relevant for the cohort or future studies. We list these differences in the text and repeat them in Table 2 to specify that we have considered the potential differences between the CAMO-cohort and NOWAC.

In lines 288 and following, some molecular analysis is described, however, no data is presented on molecular markers that would have been identified in a core of approximately 100 samples that is stated to approach the full systems epidemiological potential. The description of novel data and its relevance in breast cancer is absent. This is a study of limited interest.

Response: We acknowledge the reviewer’s viewpoint. Here we describe that we have analyzed several molecular markers and omics profiles. Since this manuscript is a cohort profile, we do not present the results of these analyses here, as this manuscript is meant to describe the data material gathered in this cohort. In the core of 100 samples we have additional data, as listed in the manuscript. We have both epidemiological and lifestyle data, as well as histopathological data. Furthermore, we have stained for several molecular markers and we have omics profiles both from blood samples and tumor tissue. We believe this subset is approaching the full systems epidemiological potential due to the wide range of available data.

Reviewer 3

“The manuscript “Cohort profile: The Clinical and Multi-omic (CAMO) cohort, part of the Norwegian Women and Cancer (NOWAC) study” is basically a description of study design, data collection, and representativeness of the breast cancer cases of CAMO compared to the larger NOWAC study. The data collection is surely very unique.”

My suggestions to improve readability and clarity of the manuscript, and to focus on what is really the aim of this manuscript are the following (in order of the manuscript):

“In the manuscript the CAMO cohort includes women with breast cancer, please clarify if this is only invasive or also DCIS.”

Response: We highly appreciate that the reviewer finds the collected data to be very unique. Only women with invasive breast carcinomas are included in the CAMO cohort. To clarify this important issue, we have included the word «invasive» in the Abstract, p. 2, line 28, which now reads «..These study participants were diagnosed with invasive breast cancer…» and in the Introduction, p. 4, lines 77-78, which now reads «…388 female patients with invasive breast cancer…».

“Were all women included before or after a first primary breast cancer?”

Response: All women in the CAMO-cohort were recruited to the cohort after primary breast cancer. Women in the NOWAC study were recruited regardless of cancer status at the time of recruitment.

“How was dealt with women that had another cancer before their breast cancer or were there none?”

Response: From the cancer registry we have received the first cancer diagnosis regardless of cancer type. We can thus see which type of cancer was the participants' first type of cancer, and thus distinguish those participants from the rest.

“The figures are quite informative. However, Figure 2 was not immediately clear to me. It would help to add the information related to A-E directly in text boxes with the figure.

Is each dot 5 women? Then I think I only count 380 women?”

Response: Because of the amount of information related to A-E, we are not able to add the information directly into text boxes within the figure. Due to rounding by the software used to create the figures, the number of dots is not exact. The shortfall of about one and a half dots is due to rounding of the number of cases per year.
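The rounding effect described in this response can be illustrated with a small sketch. It assumes, hypothetically, that the plotting software converts each year's case count to dots separately, rounding to the nearest whole dot of five women; the function names and the yearly counts below are invented for illustration and are not the CAMO data or the actual figure software.

```python
import math

WOMEN_PER_DOT = 5  # assumption: each dot in the figure represents five women


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with halves rounded up."""
    return math.floor(x + 0.5)


def dots_per_year(cases_by_year: dict[int, int]) -> dict[int, int]:
    """Convert yearly case counts to whole dots, rounding each year separately."""
    return {
        year: round_half_up(n / WOMEN_PER_DOT)
        for year, n in cases_by_year.items()
    }


# Hypothetical yearly counts, invented for illustration only (not CAMO data).
example_cases = {2001: 12, 2002: 17, 2003: 22, 2004: 11, 2005: 16}

dots = dots_per_year(example_cases)
total_cases = sum(example_cases.values())
women_shown = sum(dots.values()) * WOMEN_PER_DOT

print(f"actual cases: {total_cases}, cases implied by dots: {women_shown}")
```

In this invented example, 78 cases are drawn as 14 dots, which a reader would count as 70 women. Because every year happens to round down, the per-year remainders add up to a shortfall of 1.6 dots even though no single year loses a full dot.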

“It is also not visible in the figure which proportion of women had second follow-up is it possible to somehow visualize this? I would otherwise refer to table 1 (see next comment).”

Response: Unfortunately, we are not able to clearly visualize the proportion of women who had a second follow-up in Figure 2. However, this information is presented in Table 1, which follows immediately after Fig. 2 in the manuscript.

“In table 1 add the median (range) time between baseline and first and second follow-up. This information is especially needed given the claims in the discussion about the value of having a cohort with different times between cancer-free and after cancer diagnosis.”

Response: We have provided the requested information in lines 247-248 of the Results section.

“Page 6 line 129/130. It is a pity that less than half of the women provided a blood sample before breast cancer diagnosis, more incident breast cancers would have been valuable. How does this proportion of incident and prevalent breast cancer (in relation to the blood sampling) in CAMO relates to NOWAC?”

Response: We highly appreciate the reviewer’s comment. Please note that we have corrected the information provided in lines 130-133 to correspond with the information in Table 1: 53.2% of the women provided a blood sample before diagnosis. Correspondingly, in the NOWAC Post-genome cohort (n=50 000), there are 3035 BC cases, of which 1888 (62%) gave a blood sample before the BC diagnosis. We agree with the reviewer that it would be very interesting and valuable to have more pre-diagnostic blood samples. However, as described in Materials and methods, section on Study population, the NOWAC study was a prospective cohort study started in 1991, which was expanded to include blood samples in 2003-2006. The samples were collected irrespective of any previous breast cancer diagnosis, and breast cancer diagnoses were identified in retrospect through linkage to the Cancer Registry of Norway. As a result, a proportion of the patients had blood samples drawn after their diagnosis.

Page 7 line 145 “self-reporting of family history of breast cancer (mother, sister),”

Please clarify. Was indeed only family history of mothers and sister reported, not of all first-degree relatives?

Response: That’s correct, only the family history of mother and sister was included.

“Page 8 “To ensure diagnostic quality, we reanalyzed ER and PR for all breast cancers diagnosed before 2001”

Which proportion of the data does this concern? There will also still be substantial variation between hospitals/ laboratories, so why did the authors not re-analyze receptors for all samples? For those that were not stained again, were these re-scored by one pathologist?”

Response: We appreciate the reviewer’s comment. As illustrated in Fig. 2, around 104 of the breast cancers included in the cohort were diagnosed before 2001. The analyses of ER and PR were done as part of routine diagnostic evaluation at two closely collaborating pathology labs. However, all histopathological parameters included in the study have been re-evaluated by a breast pathologist (co-author L.M.). Since many of the older slides were of suboptimal quality and ER and PR were reported using scores no longer included in national guidelines (e.g. Allred score, different cut-offs), we decided that all slides from before 2001 should be restained and reanalyzed, but all ER- and PR-stained slides were evaluated by the breast pathologist in the study. We have now included a line about this in the description of the clinical database on p. 8, lines 162-163: «All histopathological data, including receptor status, were re-evaluated by a breast pathologist in the study.»

Page 14 discussion: “Molecular markers in tumors have been analyzed for the entire cohort, whereas a core of approximately 100 samples is approaching the full systems epidemiology potential.”

Where can this number be found back in the text or tables? Figure 3 and tables 1-3 suggest that all data is available for almost all breast cancer cases. Does this only refer to the part on which there is also omics data? I strongly suggest including a table with omics information in the results section (not necessarily results of analyses, but at least showing which data is available for how many samples). It would also be informative to show how data availability is distributed over core breast cancer subgroups e.g. pre- versus post-menopausal and subtypes (at least ER positive versus negative).

Response: We appreciate the reviewer’s comment. Please note, as illustrated in Fig. 3, that information from national registries, questionnaires and medical records, as well as collected tumor tissue and blood samples, is available for all breast cancer cases included in the CAMO cohort. As stated in Material and methods, section on Analyses of blood samples, a series of omics analyses have already been carried out on a subset of these samples. This illustrates that the collected material available for the entire CAMO cohort is suitable for omics analyses. However, to clarify the number of samples for which omics data is already available, we have included this information in line 191 and given a short summary in lines 221-223. Further, the breast cancer cases included in the pilot study have been described in great detail, including information on breast cancer subgroups etc., in the published article referred to in the present manuscript (ref. 16). We have rephrased lines 309-310 in the Discussion to clarify that information on these 100 samples can be found in ref. 16. In this cohort profile manuscript, our emphasis is on the CAMO cohort as a whole and its potential.

“The discussion is overall quite long, would consider reducing the parts on own previous results and on multiple dimensions. Related to previous results, a burning question would be how the CAMO dataset can help with the next steps, this is not really addressed.

The data collection is very unique, but still the sample set is small for multilevel analyses. Perhaps the discussion should be tuned down a bit on the promises for the future.”

Response: Since this is a cohort profile paper, and the data material in the CAMO cohort has already been used in other research projects, we wanted to describe the previous research thoroughly in the discussion section of the manuscript. In lines 325 to 335 we discuss the possible future use of the data material in the CAMO cohort.

We appreciate the reviewer’s suggestion on toning down the discussion on future use of the CAMO cohort. However, regarding the sample size, we believe further studies using the CAMO cohort can generate hypotheses that can be tested in larger cohorts.

One of the most important pieces of information, which is basically table 2 only appears late in the discussion (p17 row 350-356), suggest moving this earlier in the discussion.

Response: We also address the information in Table 2 in lines 254-269 under Results.

“In CAMO, serial sampling of biological specimens was not carried out. To account for this, we have approached the time issue using group level data, exploiting the randomness of the length of follow-up time due to the study design of the NOWAC Post genome cohort.”

Indeed, the lack of serial sampling may be mentioned as a shortcoming. However, the current manuscript does not provide any support or evidence in the shown tables or figures that this short coming is covered by the study design and sampling over a longer period. Frankly I would debate that this is possible unless the study would be very large and the distribution of patient and tumor characteristics well balanced over time between samples and diagnosis of cancer.

Response: We appreciate the comment from the reviewer and recognize this shortcoming in our data. However, we have previously demonstrated that the samples can be used to investigate time trends. This was shown in ref. 18 and 20, which are part of our discussion section highlighting some of the potential uses of the data. In the provided references, assumptions and shortcomings of this design are also discussed.

“The NOWAC study was previously externally validated (28).” Please explain better, not clear what is meant here.

Response: It is important to examine the external validity, i.e. the possibility of making inferences to the general population outside the study sample. In the study that we refer to, the authors investigated three different methodological aspects of external validity regarding NOWAC: linkage to national registries, inquiry to non-responders, and comparison between observed and expected cancer incidence rates. The authors conclude that the analysis revealed no major source of selection bias and that the NOWAC cohort is representative of women living in Norway.

We have included the phrase “… by Lund et al. 2003, revealing no major source of selection bias” on p. 17 line 367-368 to make this clearer.

Appendix. Correct typo page 1 “as as”

Response: We have corrected the typo on page 1 in the Appendix.

The appendix explains the scoring of the TMAs but not of the original and newly stained whole slides for ER, PR, HER2.

Response: The scoring of ER, PR and HER2 is explained in the section on the clinical database, p. 8.

This sentence in the Appendix “Each researcher gives three scores to each core, reflecting staining density and/or intensity in stroma, tumor cytoplasm and tumor nucleus.” needs some rewriting. I guess not all markers stain in all areas (stroma, cytoplasm, nucleas) and is always either density (is this % of cells?) or intensity (is this vague to strong coloring?) scored?

Response: We appreciate the reviewer’s comment and have rewritten this paragraph to make it clearer, «Typically, each researcher scores the staining density, i.e. number or percentage of positive cells, and/or intensity…».

“A better explanation of how the data can be requested, even if restrictions apply, should be included in the data availability statement”.

Response: We have included the following paragraph in the data availability statement: “Due to the sensitivity of the data that has been collected in the CAMO cohort, both from the data material collected in NOWAC and the medical records as well as from the linkage to the Cancer Registry, the data cannot be placed in a public repository. Data cannot be shared publicly since the data is of a detailed and sensitive nature. However, data will be available upon request to NOWAC at nowac@uit.no.”

Reviewer 4

“The authors present a cohort study, CAMO, which sounds excellent. CAMO is a substudy of a larger national cohort of individuals; these individuals developed breast cancer and were tracked over a period of time. There was a baseline questionnaire and for most patients 2 followup questionaires regarding various quality of life metrics. There were blood samples taken, sometime before and sometimes after the patients were diagnosed with breast cancer. The blood samples have been used for various molecular studies including mRNA and miRNA profiling and metabolomics. This sounds a great resource. Methods are described for all the profiling studies that have taken place. The paper describes the cohort (demographics, clinical/pathological features), and compared to the larger cohort, with few notable differences or findings.

I was looking forward to a description of the longitudinal quality of life data or the molecular analysis of blood samples or tissue samples, or the attempt at ‘systems epidemiological analysis’ but this was described as published data in the discussion or indicated as the potential for such a resource. I understand the value of describing a cohort and its potential for answering research questions before research has been done, but this seems the wrong way around. So I am puzzled as to why this is being submitted for publication at this point. Might be better to incorporate some data of interest to the readership (QoL data or tumour profiling studies that are in progress). Might be better submitted to a Biorepositories focused journal.”

Response: We appreciate the reviewer’s comment. Cohort profiles are articles that describe ongoing research cohorts, and, in brief, cohort profiles will describe large, collaborative prospective studies that identify a group of participants and follow them for long periods.

In contrast to research papers, which are traditional results papers and should address a specific research question, cohort profile papers are published to provide information on a cohort’s establishment. The information that is published in cohort profile papers goes beyond what can reasonably be described in the methods section of a research paper. Another reason to publish a cohort profile paper is to advise other researchers of existing datasets and opportunities for collaboration.

We think that the advantage of publishing our cohort profile paper in PLOS ONE is that anyone interested can easily access it when appraising studies that arise from it. The data material in the CAMO cohort has already been used in other research projects, which is why we describe that research in the discussion section of the manuscript.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Alvaro Galli

18 Jan 2023

Cohort profile: The Clinical and Multi-omic (CAMO) cohort, part of the Norwegian Women and Cancer (NOWAC) study

PONE-D-22-25688R1

Dear Dr. Delgado,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Alvaro Galli

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #3: No

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #3: The authors have sufficiently addressed the feedback. Personally I would have reshaped the discussion a bit more and made some further attempts to clarify the data (e.g. being clearer on how many women had a previous cancer or not, and adding a footnote to figure 2 explaining that the number of dots is less due to rounding), but the difference in viewpoint of what should and should not be included in this manuscript does in my view not hamper publication of the manuscript.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #3: No

**********

Acceptance letter

Alvaro Galli

26 Jan 2023

PONE-D-22-25688R1

Cohort profile: The Clinical and Multi-omic (CAMO) cohort, part of the Norwegian Women and Cancer (NOWAC) study

Dear Dr. Delgado:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Alvaro Galli

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Appendix. Analyses of blood samples and tumor tissue.

    (DOCX)

    Data Availability Statement

    The questionnaire and registry data are stored and managed by the NOWAC research group at the Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway. Blood samples collected as part of the NOWAC Post-genome cohort (plasma, buffy coat, and PAXgene Blood RNA samples) are kept at the UiT Core Facility for Biobanking. The clinical sample material and the clinical database are kept and managed by the Department of Clinical Pathology, University Hospital of North Norway, Tromsø, Norway. Due to the sensitivity of the data that has been collected in the CAMO cohort, both from the data material collected in NOWAC and the medical records as well as from the linkage to the Cancer Registry, the data cannot be placed in a public repository. Data cannot be shared publicly since the data is of a detailed and sensitive nature. However, data will be available upon request to NOWAC at nowac@uit.no.


    Articles from PLOS ONE are provided here courtesy of PLOS
