Abstract
The National Institute of Standards and Technology (NIST) Standard Reference Material 2373 is a set of genomic DNA samples prepared from five breast cancer cell lines with certified values for the ratio of the HER2 gene copy number to the copy numbers of reference genes determined by real-time quantitative PCR and digital PCR. Targeted-amplicon, whole-exome, and whole-genome sequencing measurements were used with the reference material to compare the performance of both the laboratory steps and the bioinformatic approaches of the different methods using a range of amplification ratios. Although good reproducibility was observed in each next-generation sequencing method, slightly different HER2 copy numbers associated with platform-specific biases were obtained. This study clearly demonstrates the value of Standard Reference Material 2373 as a reference material and as a calibrator for evaluating assay performance as well as for increasing confidence in reporting HER2 amplification for clinical applications.
High levels of expression of the human epidermal growth factor receptor (HER)-2 protein, due to amplification of the HER2 gene (ERBB2), frequently occur in breast cancers1, 2 and gastrointestinal cancers.3, 4 The classic methods of measuring HER2 are immunohistochemistry analysis for protein expression and fluorescence in situ hybridization techniques for gene amplification.5 Accurate measurements of HER2 amplification levels are important for determining the proper treatment using anti-HER2 therapeutics.4, 6, 7
There is a good correlation between the HER2 gene copy numbers and HER2 protein expression using immunohistochemistry analysis and fluorescence in situ hybridization methods when the measurements are carefully performed in clinical breast cancer samples8 and also in breast cancer cell lines.9 An analysis of individual cells from breast cancer cell lines showed a good correlation between a high copy number of chromosome 17 (where the HER2 gene is located) and high HER2 protein expression.9
Recently, real-time quantitative PCR (qPCR) was shown to be a sensitive method of confirming HER2 gene amplification in clinical samples in which immunohistochemistry analytical methods have failed.10 There is considerable interest in measuring the amplification of the HER2 gene along with other gene targets in cancer, using quantitative nucleic acid measurement techniques, especially those methods that can screen for panels of mutations to tailor treatments for individual patients.11
Next-generation sequencing (NGS) provides a powerful tool to detect multiple genetic alterations in a quantitative manner. Targeted-amplicon and whole-exome sequencing (WES) technologies for mutation measurements in cancer are becoming more widespread in clinical laboratories. Those NGS assays sequence targeted regions with clinically or biologically relevant genetic loci with high read depth coverage. Therefore, the assay can achieve high sensitivity for detecting somatic mutations in cancer. However, clinically relevant copy number variations (CNVs) are an essential measurement. NGS has shown great promise, and advances in computation methods have occurred,12, 13, 14 but these methods have not yet been fully validated using reference materials to confirm the calculations. Recently, the use of whole-genome sequencing (WGS) (at low coverage levels), along with recent advances in computation methods, has allowed for the calculation of CNVs using NGS.13 Guidelines on the quality management of NGS in clinical applications have been proposed, including test validation, quality-control procedures, proficiency testing, and the use of reference materials.15 NGS assays intended for clinical oncology applications have been validated using pooled cancer cell lines and clinical samples.16, 17
Reference materials are well-characterized samples that are ensured to be homogeneous and stable and to accurately reflect the intended analyte. They can be used to ensure that measurement methods are working correctly, to calibrate instruments, to evaluate assay performance, to identify the weaknesses of the measurement process, and to assign values to other materials.18 Reference materials are used to improve the confidence and reliability of measurement methods by providing samples that are ensured to have the certified properties for their intended purpose. There are significant challenges in producing reference materials from biological materials, especially cell lines that require a high level of characterization, because of differences in the genomic materials due to the genetic drift that occurs during cell culture.
Currently, reference materials for cancer measurements are very limited. A World Health Organization international standard for the BCR-ABL mRNA levels consists of mixtures of cell lines to obtain different levels of the gene fusion mRNA target.19 The Genetic Testing Reference Materials coordination program (Centers for Disease Control and Prevention, http://www.cdc.gov/clia/Resources/GetRM, last accessed July 8, 2016) publishes consensus data from interlaboratory measurements on cell lines obtained from cell repositories.20, 21, 22 Due to the lack of reference materials, many laboratories produce their own control materials derived from cell lines or limited patient samples on an ad hoc basis. However, these ad hoc materials are not characterized fully, and stability and homogeneity may not have been determined.
In this report, we show the utility of the National Institute of Standards and Technology (NIST) Standard Reference Material (SRM) 2373 to ensure the quality and improve the confidence of the measurement of HER2 gene amplification using different NGS methods.
Materials and Methods
Preparation of Cell Lines
Details regarding the certification of NIST SRM 2373 are available from the certificate of analysis and an article describing the development and characterization of SRM 2373.23 The cell lines were obtained from the American Type Culture Collection (ATCC) (Manassas, VA) and grown in the NIST laboratories using the recommended culture conditions. The breast cancer cell lines have different sources, hormone sensitivity,24 and karyotypes25, 26, 27, 28 (Table 1). The HapMap cell lines CEPH NA12878 (characterized by the Genome in a Bottle Consortium29), Chinese Female (NA18526), and Yoruban (NA18507) were used as reference genomes for the NGS methods. All three HapMap cell lines were obtained from the Coriell Institute for Medical Research (Camden, NJ) and cultured using the recommended culture conditions.
Table 1.
Information for the Five Cell Lines Used in SRM 2373
| Cell line | MDA-MB-231 component B | MDA-MB-361 component C | MDA-MB-453 component D | BT-474 component E | SK-BR-3 component A |
|---|---|---|---|---|---|
| Source | Pleural effusion | Brain metastasis | Pericardial effusion | Invasive ductal carcinoma | Pleural effusion metastasis |
| Karyotype | Hypotriploid26 | Hyperdiploid25 | Tetraploid27, 28 | Hypertriploid25 | Hypertriploid25 (highly modified) |
| Type | Basal24 | Luminal A24 | Luminal24 | Luminal A24 | Luminal B24 |
| Hormone status | Triple negative24 | ER positive24 | Negative ER/PR24 | Triple Positive24 | Negative ER24 |
| NIST-certified ratio | 1.3 | 6.4 | 2.9 | 17.7 | 9.7 |
| 95% uncertainty interval | 1.1–1.5 | 5.7–7.1 | 2.6–3.2 | 15.9–19.5 | 8.7–10.7 |
ER, estrogen receptor status; NIST, National Institute of Standards and Technology; PR, progesterone receptor status; SRM, Standard Reference Material.
DNA Extraction, Identity Authentication, and Reference Material Preparation
Genomic DNA samples were prepared using the Quick-gDNA MidiPrep kit (catalog number D3100; Zymo Research, Irvine, CA). The DNA samples were treated with bovine pancreatic ribonuclease A before purification to remove the enzyme.23 All purified genomic DNA samples were prepared in buffer (10 mmol/L Tris, 0.1 mmol/L EDTA, pH 8.0) and stored at 4°C. The cell line DNA was authenticated using the AmpFLSTR Identifiler Plus PCR Amplification Kit (catalog number 4427368; Life Technologies, Carlsbad, CA) on a 3500xl Genetic Analyzer with a 36-cm capillary array and POP-4 polymer (Life Technologies). The DNA samples were analyzed both when received from the repository and when the cells were expanded to produce the genomic DNA used for the production of SRM 2373.23 The short tandem repeat markers matched the values documented by ATCC. Cell lines were checked for mycoplasma contamination using a PCR kit (Universal Mycoplasma Detection Kit 30-1012K; ATCC). An approximate concentration of the DNA components was determined using absorbance at 260 nm on five replicates (using the value 1 absorbance unit at 260 nm = 50 μg/mL).23
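The absorbance-based concentration estimate described above is a simple conversion. The following is a minimal sketch of that arithmetic; the five readings and the optional `dilution_factor` argument are hypothetical and are not taken from the study.

```python
from statistics import mean

# Conversion stated in the text: 1 absorbance unit at 260 nm = 50 ug/mL.
UG_PER_ML_PER_A260 = 50.0


def dna_concentration_ug_per_ml(a260_readings, dilution_factor=1.0):
    """Convert the mean of replicate A260 readings to a DNA concentration in ug/mL.

    dilution_factor is a hypothetical convenience for samples that were
    diluted before reading (for example, 10.0 for a 1:10 dilution).
    """
    if not a260_readings:
        raise ValueError("at least one A260 reading is required")
    return mean(a260_readings) * UG_PER_ML_PER_A260 * dilution_factor


# Hypothetical five replicate readings for one component.
readings = [1.02, 0.98, 1.00, 1.01, 0.99]
print(round(dna_concentration_ug_per_ml(readings), 1))
```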
qPCR and dPCR at NIST
TaqMan assays for HER2 and the four reference genes were developed at NIST23 and validated using the MIQE30 and dMIQE31 guidelines. The primers and gene locations are shown in Table 2, and the probes, in Table 3. Black hole quencher 1– and fluorescein amidite–labeled probes were obtained from LGC Biosearch Technologies (Novato, CA). The qPCR measurements were taken using SYBR Green to determine the quantitation cycle using the ABI 7500 PCR System (Life Technologies). A quantitative human genomic DNA standard, SRM 2372 component A (produced from white blood cells from a healthy human male donor), was used to calibrate the standard curves for HER2 and the reference genes.23 The chamber dPCR reactions were run on a BioMark platform (Fluidigm, San Francisco, CA) and the droplet dPCR assays were performed using a QX100 Droplet Digital PCR System (Bio-Rad, Hercules, CA).
Table 2.
PCR Primer Information for Reference Genes and HER2 Assays Used by MoCha
| Primer name | Sequence | PCR amplicon (bp) | Gene name | Location (GRCh37/hg19 nucleotide number) |
|---|---|---|---|---|
| HER2-2F | 5′-CTCATCGCTCACAACCAAGT-3′ | 112 | HER2 (17q12) | Exon 7 (chr17:37864601-37864620) |
| HER2-2R | 5′-GGTCTCCATTGTCTAGCACG-3′ | | | (chr17:37864693-37864712) |
| EIF5-F | 5′-GGCCGATAAATTTTTGGAAATG-3′ | 112 | EIF5B (2q11.2) | Intron 1 (chr2:99974140-99974161) |
| EIF5-R | 5′-GGAGTATCCCCAAAGGCATCT-3′ | | | (chr2:99974231-99974251) |
| 2PR4-F | 5′-CGGGTTTGGGTTCAGGTCTT-3′ | 97 | RPS27A (2p16) | Intron 4 (chr2:55462316-55462335) |
| 2PR4-R | 5′-TGCTACAATGAAAACATTCAGAAGTCT-3′ | | | (chr2:55462386-55462412) |
| R4Q5-F | 5′-CTCAGAAAAATGGTGGGAATGTT-3′ | 122 | DCK (4q13.3-q21.1) | Exon 3 (chr4:71888097-71888119) |
| R4Q5-R | 5′-GCCATTCAGAGAGGCAAGCT-3′ | | | (chr4:71888199-71888218) |
| 22C3-F | 5′-AGGTCTGGTGGCTTCTCCAAT-3′ | 78 | PMM1 (22q13.2) | Intron 7 (chr22:41973739-41973759) |
| 22C3-R | 5′-CCCCTAAGAGGTCTGTTGTGTTG-3′ | | | (chr22:41973682-41973704) |
MoCha, Molecular Characterization and Clinical Assay Development Laboratory at the Frederick National Laboratory for Cancer Research (Frederick, MD).
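The amplicon sizes listed in Table 2 can be cross-checked against the primer coordinates. The sketch below assumes the coordinates are 1-based and inclusive (an assumption that reproduces the listed sizes); the dictionary layout and the function name are illustrative only.

```python
# Primer coordinates (GRCh37/hg19) and listed amplicon sizes, copied from Table 2.
# Each entry: (forward primer span, reverse primer span, listed amplicon size in bp).
PRIMER_PAIRS = {
    "HER2": ((37864601, 37864620), (37864693, 37864712), 112),
    "EIF5B": ((99974140, 99974161), (99974231, 99974251), 112),
    "RPS27A": ((55462316, 55462335), (55462386, 55462412), 97),
    "DCK": ((71888097, 71888119), (71888199, 71888218), 122),
    "PMM1": ((41973739, 41973759), (41973682, 41973704), 78),
}


def amplicon_length(forward, reverse):
    """Length of the region spanned by both primers (1-based, inclusive).

    Taking min/max over all four positions handles assays on either strand,
    such as PMM1, where the forward primer has the higher coordinates.
    """
    positions = [*forward, *reverse]
    return max(positions) - min(positions) + 1


for gene, (fwd, rev, listed) in PRIMER_PAIRS.items():
    print(gene, amplicon_length(fwd, rev), listed)
```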
Table 3.
TaqMan Fluorescent Probe Sequences for HER2 and Reference Gene Assays Used by MoCha
| Probe name | Sequence | 5′ Label | 3′ Quencher |
|---|---|---|---|
| HER2-2 (BHQ) | 5′-ACCCAGCTCTTTGAGGACAACTATGC-3′ | FAM | BHQ-1 |
| EIF5-P | 5′-TTCAGCCTTCTCTTCTCATGCAGTTGTCAG-3′ | FAM | BHQ-1 |
| 2PR4-P | 5′-TTTGTCTACCACTTGCAAAGCTGGCCTTT-3′ | FAM | BHQ-1 |
| R4Q5-P | 5′-CCTTCCAAACATATGCCTGTCTCAGTCGA-3′ | FAM | BHQ-1 |
| 22C3-P | 5′-CAAATCACCTGAGGTCAAGGCCAGAACA-3′ | FAM | BHQ-1 |
BHQ, black hole quencher; FAM, fluorescein amidite; MoCha, Molecular Characterization and Clinical Assay Development Laboratory at the Frederick National Laboratory for Cancer Research (Frederick, MD).
dPCR at MoCha
The SRM 2373 material was further characterized by digital PCR (dPCR) at the Molecular Characterization and Clinical Assay Development Laboratory (MoCha), Frederick National Laboratory for Cancer Research (Frederick, MD). The dPCR assays used the PCR primers and TaqMan probes shown in Tables 2 and 3. The dPCR assays at MoCha were performed using a QX200 Droplet Digital PCR System (Bio-Rad). NIST SRM 2373 components A to E and CEPH NA12878 DNA samples were diluted to approximately 5 ng/μL using PCR-grade water. Additionally, components A and E DNA samples were diluted to 1 ng/μL using PCR-grade water. dPCR reactions (25 μL) were prepared in triplicate for each DNA sample/gene combination. The dPCR reactions consisted of 2× dPCR Supermix for Probes (no 2′-deoxyuridine 5′-triphosphate) (Bio-Rad), 900 nmol/L final concentration forward and reverse primers, 250 nmol/L probe, and 4 μL of the prepared DNA dilutions. DNA templates (20 ng) were loaded for all reactions except HER2 detection of components A and E. For these reactions, 4 ng of DNA was loaded because of the high levels of HER2 amplification. The dPCR reaction plates were placed on an Automated Droplet Generator (Bio-Rad), on which droplets were generated and then transferred to a 96-well PCR plate. The plate was heat-sealed with a foil seal and placed on a C1000 Touch Thermal Cycler (Bio-Rad). Amplification was performed as follows: 95°C for 10 minutes; 40 cycles of 94°C for 30 seconds and 60°C for 60 seconds; 98°C for 10 minutes; and a 4°C hold. On completion of amplification, droplets were analyzed on a QX200 Droplet Reader (Bio-Rad) using the Abs experiment setting. Data were analyzed using QuantaSoft software version 1.7.4.0917 (Bio-Rad). The number of copies per microliter from each sample was divided by the total dilution factor (0.04 for reactions with 20 ng input, 0.008 for reactions with 4 ng total input) to yield the copies per microliter for each stock sample.
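The back-calculation from the measured droplet concentration to the stock concentration can be written out as follows. This is a minimal sketch that uses only the two dilution factors stated in the text; the function name and the example reading are hypothetical.

```python
# Total dilution factors stated in the text, keyed by DNA input per reaction (ng).
TOTAL_DILUTION_FACTOR = {20: 0.04, 4: 0.008}


def stock_copies_per_ul(measured_copies_per_ul, input_ng):
    """Back-calculate stock copies/uL from the dPCR readout.

    measured_copies_per_ul is the concentration reported for the reaction;
    input_ng selects the dilution factor given in the text (20 or 4 ng).
    """
    if input_ng not in TOTAL_DILUTION_FACTOR:
        raise ValueError(f"no dilution factor defined for {input_ng} ng input")
    return measured_copies_per_ul / TOTAL_DILUTION_FACTOR[input_ng]


# Hypothetical readout of 100 copies/uL at each input amount.
print(round(stock_copies_per_ul(100.0, 20)))
print(round(stock_copies_per_ul(100.0, 4)))
```

Because the 4 ng reactions are five times more dilute, the same readout corresponds to a fivefold higher stock concentration.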
Measurements of each gene were performed on triplicate samples for each component of SRM 2373. The ratios of HER2 to the reference genes were calculated by dividing the overall mean of the copy numbers from the four reference genes into the individual HER2 assay copy number measurements for each sample. All of the ratios were multiplied by 2, for comparison to the normal diploid copy number in normal cells.
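The ratio calculation described above can be sketched as follows. The copy-number values are hypothetical placeholders chosen for easy arithmetic, not measurements from this study, and the function name is illustrative.

```python
from statistics import mean


def her2_copy_numbers(her2_replicates, reference_gene_means, diploid_copies=2):
    """Convert individual HER2 dPCR measurements to copies per diploid genome.

    Each HER2 replicate is divided by the overall mean of the reference-gene
    copy numbers, then multiplied by 2 for comparison with normal diploid cells.
    """
    reference_mean = mean(reference_gene_means.values())
    return [diploid_copies * value / reference_mean for value in her2_replicates]


# Hypothetical stock concentrations (copies/uL) for one component.
reference_gene_means = {"PMM1": 100.0, "RPS27A": 100.0, "DCK": 100.0, "EIF5B": 100.0}
her2_replicates = [970.0, 1000.0, 1030.0]

for copies in her2_copy_numbers(her2_replicates, reference_gene_means):
    print(round(copies, 1))
```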
NGS Assays
Three different NGS-based assays were used to characterize the SRM 2373 materials at the MoCha laboratory.
Exome Sequencing
For each sample, a total of 500 ng of genomic DNA, quantified by Qubit (Thermo Fisher Scientific, Waltham, MA), was sheared to 150 to 200 bp by Covaris E220 sonication (Covaris, Woburn, MA). After cleanup with AMPure XP beads (Beckman Coulter, Brea, CA), samples were checked for correct size distribution using the 2100 Bioanalyzer system (Agilent Technologies, Santa Clara, CA). Fragmented genomic DNA samples were processed with end-repair, dA addition, ligation of sequencing adaptors, and two rounds of six-cycle preamplification using the SureSelectXT Target Enrichment System (Agilent Technologies) for the Illumina Paired-End Sequencing Library construction kit (Illumina, San Diego, CA). A total of 500 ng of amplified DNA was hybridized with a biotinylated RNA bait set (SureSelectXT Human All Exon V5; Agilent Technologies) at 65°C for 24 hours. The captured genomic DNA fragments were enriched by Dynal MyOne Streptavidin T1 beads (Thermo Fisher) and amplified with barcoded index-attached primers for 12 cycles. The AMPure XP–purified libraries were checked for size distribution (300 to 400 bp) using an Agilent Bioanalyzer (Agilent Technologies) and quantified using a Library Quantification Kit (Kapa Biosystems, Wilmington, MA). A pooled library made from two final libraries mixed at an equimolar ratio was clustered at 20 pM per flowcell lane using the Illumina cBot system before sequencing on an Illumina HiSeq 2500 platform (Illumina). Sequencing reactions were run using 2 × 125 paired-end mode. Demultiplexed FASTQ files were generated with Casava software version 1.8.2 configuration bcl2fastq.pl (Illumina) from the .bcl files. The multiple FASTQ files generated by this script were concatenated and primer-trimmed using the ea-utils fastq-mcf tool with the options “-l 30 -q 10 -u -P 33” to remove Illumina PCR and sequencing primers from the sequences.
The trimmed sequences were mapped to human genome reference hg19 using Burrows-Wheeler Aligner software version 0.6.2 in aln and sampe modes with default settings.32 The resulting SAM files were converted to BAM format, sorted, de-duplicated, and indexed using SAMtools and Picard.33 Three algorithms (cn.MOPS,34 CONTRA,35 and ExomeCNV36) were applied to the exome data to detect the CNVs. In brief, cn.MOPS uses local modeling of the depth of coverage to minimize false-positive CNV calling; similarly, CONTRA uses the base-level log-ratio of coverage between tumor and control to infer the gain or loss of each region. ExomeCNV is based on data observed from six human samples captured using a single-exome capture platform and models log-ratios using the Geary-Hinkley transformation, for which a normally distributed exon-level depth of coverage is assumed. Seven replicates of WES data generated from HapMap CEPH NA12878 were used as a reference to call CNVs in SRM 2373 cell lines.
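The idea shared by the log-ratio approaches mentioned above can be illustrated with a region-level toy calculation. This is a simplified sketch and not the implementation of cn.MOPS, CONTRA, or ExomeCNV. The library-size normalization, the pseudocount, and all function and parameter names are assumptions made for illustration.

```python
import math


def log2_coverage_ratio(test_depth, control_depth, test_total, control_total,
                        pseudocount=0.5):
    """Log2 ratio of library-size-normalized coverage for one region.

    test_depth and control_depth are read counts in the region; test_total and
    control_total are total mapped reads. The pseudocount (an assumption)
    avoids taking the log of zero in uncovered regions.
    """
    test_fraction = (test_depth + pseudocount) / test_total
    control_fraction = (control_depth + pseudocount) / control_total
    return math.log2(test_fraction / control_fraction)


def copy_number_from_log2(log2_ratio, control_copies=2):
    """Convert a log2 ratio to a copy number, assuming a diploid control."""
    return control_copies * 2 ** log2_ratio


# Hypothetical region with four times the normalized coverage of the control.
ratio = log2_coverage_ratio(400, 100, 1_000_000, 1_000_000, pseudocount=0)
print(round(ratio, 3), round(copy_number_from_log2(ratio), 2))
```

In this toy case a fourfold coverage excess corresponds to a log2 ratio of about 2, or roughly eight copies relative to a diploid control.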
Whole-Genome Sequencing
One microgram of DNA from each sample was used for library preparation for WGS. The TruSeq DNA PCR-Free Library Preparation Kit (catalog number FC-121-3001; Illumina) was used for sequencing library preparation according to the vendor's recommended method (TruSeq Library Prep guide; Illumina). Sequencing libraries were quantified using a Library Quantification Kit (Kapa Biosystems). Subsequently, sequencing libraries from three samples were pooled at equal molar ratios and requantified using the same Library Quantification Kit and were clustered at 20 pM per flowcell lane using the Illumina cBot before sequencing on an Illumina HiSeq 2500 platform (Illumina). A total of four replicates of the WGS assay were performed for each SRM cell line in two batches: one replicate was performed in the first batch, and the other three replicates were performed in the second batch. Sequencing reactions were run using 2 × 125 paired-end mode. FASTQ data generated using the bcl2fastq tool version 1.8.4 (Illumina) were run through FastQC (Babraham Bioinformatics, Cambridge, UK) for quality checking, and any adaptor contamination and low-quality bases found in reads were removed later by Trimmomatic.37 The processed reads were mapped to the human hg19 reference genome using NovoAlignMPI (default parameters; Novocraft Technologies, Selangor, Malaysia) on the Biowulf cluster (CIT/NIH). The resulting BAM files were sorted and indexed using Picard.33 CNVnator38 was then applied to these WGS data to detect copy number changes. This algorithm uses the established mean-shift approach,39 with additional corrections for multiple-bandwidth partitioning and GC correction for more accurate CNV detection. For the HER2 gene, CNVnator was used to calculate the copy number over the whole length of the gene by normalizing the local read depth signal to the genomic average for a region of the same length.
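The read-depth normalization in the last step can be sketched in a few lines. This is a deliberately simplified illustration, not CNVnator itself: it omits the mean-shift partitioning and GC correction, and the bin depths, function name, and diploid assumption are hypothetical.

```python
def read_depth_copy_number(gene_bin_depths, genome_bin_depths, diploid_copies=2):
    """Estimate copy number from binned read depth.

    The mean depth over the bins covering the gene is normalized to the mean
    depth over genome-wide bins of the same size, then scaled so that the
    genomic average corresponds to two copies.
    """
    if not gene_bin_depths or not genome_bin_depths:
        raise ValueError("both depth lists must be non-empty")
    gene_mean = sum(gene_bin_depths) / len(gene_bin_depths)
    genome_mean = sum(genome_bin_depths) / len(genome_bin_depths)
    return diploid_copies * gene_mean / genome_mean


# Hypothetical low-coverage data: about 5x genome-wide, 25x over the gene.
genome_bins = [5.0] * 10
gene_bins = [24.0, 26.0, 25.0]
print(read_depth_copy_number(gene_bins, genome_bins))
```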
Oncomine Targeted-Amplicon Sequencing Assay
The Oncomine Cancer Panel NGS assay40 (Thermo Fisher) was used to determine the HER2 copy number. Two PCR reactions each using 10 ng of DNA were prepared, one for each primer pool in the Oncomine Cancer Research Panel, using the Ion AmpliSeq Library Kit version 2.0 (Thermo Fisher). The primer pools were combined after amplification and treated with FuPa reagent, and a separate Ion Xpress barcode (Thermo Fisher) was ligated to each sample library. The resulting libraries were purified and then quantified using the Ion Library Quantitation Kit (Thermo Fisher). The Ion PGM Template OT2 200 Kit and Ion OneTouch 2 System (Thermo Fisher) were used to prepare templates for DNA libraries, followed by enrichment on the Ion OneTouch ES. The Ion Sequencing 200 Kit version 2, Ion 318 Chip Kit version 2, and Ion Torrent PGM System (Thermo Fisher) were used to sequence the template libraries. All procedures were performed following the manufacturer's instructions. Completed runs were reviewed for quality based on thresholds for the number of reads (≥3 million), read length (≥75 bp), uniformity (≥80%), 100× amplicon coverage (≥90%), and median absolute pairwise difference values (<0.9). The uniformity and amplicon coverage were determined using the Coverage Analysis plug-in (Thermo Fisher). Data analysis was performed using Torrent Suite software version 4.4.2 and Ion Reporter software version 4.4.2 (Thermo Fisher). Copy number analysis was performed using the Copy Number module within the Oncomine Cancer Research Panel workflow within the Ion Reporter system. To generate a baseline for CNV calling, nine replicates of previously generated Oncomine Cancer Panel HapMap normal values (3× CEPH NA12878, 3× Chinese Female NA18526, and 3× Yoruban NA18507) were imported onto the Ion Reporter server, and the Copy Number Baseline module was run.
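The run-level quality thresholds listed above lend themselves to a simple automated check. The sketch below encodes only the thresholds stated in the text; the metric key names and the example run are hypothetical and do not correspond to Torrent Suite or Ion Reporter field names.

```python
# Thresholds from the text; the key names are illustrative assumptions.
QC_RULES = {
    "total_reads": lambda value: value >= 3_000_000,
    "read_length_bp": lambda value: value >= 75,
    "uniformity_pct": lambda value: value >= 80,
    "coverage_100x_pct": lambda value: value >= 90,
    "mapd": lambda value: value < 0.9,
}


def failed_qc_metrics(run_metrics):
    """Return the names of the metrics that do not meet their threshold."""
    return [name for name, passes in QC_RULES.items()
            if not passes(run_metrics[name])]


# Hypothetical run summary.
run = {
    "total_reads": 4_200_000,
    "read_length_bp": 112,
    "uniformity_pct": 78.5,
    "coverage_100x_pct": 95.0,
    "mapd": 0.35,
}
print(failed_qc_metrics(run))
```

A run passes review only when the returned list is empty; in this hypothetical case the uniformity metric falls below its threshold.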
Results
The HER2 reference material is composed of genomic DNA extracted from these well-characterized cell lines: SK-BR-3 (component A), MDA-MB-231 (component B), MDA-MB-361 (component C), MDA-MB-453 (component D), and BT-474 (component E) (Table 1). The cell line used for component B is classified as HER2 amplification–negative, and the other cell lines have moderate to high levels of HER2 gene amplification.
To test cross-laboratory performance, the NIST dPCR assay methods and reagents were transferred to the MoCha laboratory. The NIST dPCR assays were performed there using different instruments and operators. Figure 1 shows the correlation of the MoCha values to the NIST-certified values. Briefly, dPCR was used to measure HER2 and the four reference genes for each of the samples; individual assays for each sample were performed in triplicate. The ratios were calculated by dividing the combined mean copy numbers of either all four reference genes (PMM1, RPS27A, DCK, and EIF5B) or of three reference genes (RPS27A, DCK, and EIF5B) into the individual HER2 copy numbers for each component. The exclusion of PMM1 improved the correlation of component D, because the PMM1 dPCR values for that component were approximately 70% higher compared to the other reference genes. The presumptive amplification of PMM1 in component D was confirmed by the NGS data. Overall, the dPCR copy numbers showed excellent agreement with the NIST-certified values (Figure 1). The NIST-certified values were calculated using three reference genes, of which the reference gene PMM1 was used for qPCR measurements of components B and C only.23
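One way to notice a reference gene that behaves like PMM1 in component D is to compare each reference gene with the median of the others. The following is a hedged sketch of such a check, not the procedure used in this study; the 30% tolerance and the copy-number values are hypothetical.

```python
from statistics import median


def flag_deviating_reference_genes(reference_copies, tolerance=0.3):
    """Flag reference genes whose copy number deviates from the other genes.

    Each gene is compared with the median of the remaining genes; a relative
    deviation above `tolerance` (an assumed cutoff) flags the gene.
    """
    flagged = []
    for gene, value in reference_copies.items():
        others = [v for g, v in reference_copies.items() if g != gene]
        baseline = median(others)
        if abs(value - baseline) / baseline > tolerance:
            flagged.append(gene)
    return flagged


# Hypothetical copies/uL for one component, with one elevated reference gene.
component = {"PMM1": 150.0, "RPS27A": 100.0, "DCK": 98.0, "EIF5B": 102.0}
print(flag_deviating_reference_genes(component))
```

A flagged gene can then be excluded before the reference mean is computed, which mirrors the three-gene calculation described above.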
Figure 1.
Correlation of the droplet digital PCR measurements of Standard Reference Material 2373 components by the Molecular Characterization and Clinical Assay Development Laboratory (Frederick National Laboratory for Cancer Research, Frederick, MD) with the National Institute of Standards and Technology (NIST)-certified values. The NIST-certified ratios and 95% CIs (horizontal bars), plotted on the x axis, were multiplied by 2. The digital PCR (dPCR) experimental values (plotted on the y axis) for the five NIST components were measured in triplicate for HER2 and the four reference genes. The dPCR copy number results were calculated by dividing the mean dPCR values from either four reference genes (PMM1, RPS27A, DCK, and EIF5B) or three reference genes (RPS27A, DCK, and EIF5B) into the individual HER2 copy number values.
Because the intended use of reference materials is to improve confidence and reliability in measurements, we analyzed the reference materials on several different NGS platforms. Figure 2 shows the results of using different sequencing methods to measure copy numbers. Figure 2A shows the gene map of HER2 along with the locations of the dPCR amplicon (used in this study), the targeted amplicons used in the Oncomine Cancer Panel, and the target regions of probes used for WES. WGS assesses the entire HER2 gene, unlike the other methods, which interrogate only a portion of the gene.
Figure 2.
Comparison of HER2 gene amplifications, measured by various next-generation sequencing assays. A: Diagram of the digital PCR amplicon, the Oncomine (Thermo Fisher Scientific, Waltham, MA) targeted panel amplicons, the whole-exome baits, and the RefSeq map of HER2. B: The National Institute of Standards and Technology (NIST)–certified ratios and 95% CIs (vertical bars) plotted were multiplied by 2. HER2 copy number values were calculated from the Oncomine Cancer Amplicon panel. Whole-genome values were calculated using whole-genome sequencing (WGS; CNVnator38); whole-exome sequencing (WES) data were calculated by either the CONTRA algorithm35 (exome CONTRA), the ExomeCNV algorithm36 (exome ExomeCNV), or the cn.MOPS algorithm34 (exome cn.MOPS). Data are expressed as means ± SD. n = 3 (WES); n = 4 (WGS).
Figure 2B compares the three NGS methods with the NIST-certified values. The HER2 copy numbers called by these NGS methods roughly align with the certified values in the range of low to moderate amplifications. Good reproducibility was observed among multiple replicates within each of the three NGS methods (amplicon, WES, and WGS). The HER2 copy numbers determined from WES data by the three CNV-calling algorithms were comparable but showed some differences. Interestingly, all NGS methods underestimated the high amplification level in component E (copy number >30). Copy numbers measured by the targeted-amplicon sequencing and WES methods seem to reach a plateau at a HER2 copy number of 20, whereas the WGS method shows a wider dynamic range.
The copy numbers of the four reference genes measured by WES and WGS (Figure 3) also aligned well with the dPCR results. However, the data indicated that the reference gene PMM1 appeared to be slightly amplified (approximately 50%) in component D, as shown by WGS and WES (for copy numbers calculated by the CONTRA and ExomeCNV algorithms, but not by the cn.MOPS algorithm).
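A comparison of this kind can be made explicit by checking whether a measured copy number falls within twice the certified 95% uncertainty interval from Table 1. The sketch below copies the certified ratios and intervals from Table 1; the measured values passed to the function are hypothetical, and the function name is illustrative.

```python
# Certified HER2 ratios and 95% uncertainty intervals from Table 1,
# keyed by SRM 2373 component: (ratio, lower bound, upper bound).
CERTIFIED = {
    "A": (9.7, 8.7, 10.7),    # SK-BR-3
    "B": (1.3, 1.1, 1.5),     # MDA-MB-231
    "C": (6.4, 5.7, 7.1),     # MDA-MB-361
    "D": (2.9, 2.6, 3.2),     # MDA-MB-453
    "E": (17.7, 15.9, 19.5),  # BT-474
}


def within_certified_interval(component, measured_copy_number):
    """True if the measured copy number lies within 2x the certified interval.

    The certified ratios are multiplied by 2 so that they are expressed as
    copies per diploid genome, as in Figures 1 and 2.
    """
    _, lower, upper = CERTIFIED[component]
    return 2 * lower <= measured_copy_number <= 2 * upper


# Hypothetical measurements for component E.
print(within_certified_interval("E", 35.0))
print(within_certified_interval("E", 20.0))
```

With these hypothetical inputs, a value of 35 copies falls inside the doubled interval for component E, whereas a value of 20 copies (the apparent plateau) falls well below it.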
Figure 3.
Comparison of the four reference genes for each component, calculated by next-generation sequencing–based methods. A–C: Experimental values for whole-exome sequencing (WES) data, calculated using either the cn.MOPS algorithm34 (A), the CONTRA algorithm35 (B), or the ExomeCNV algorithm36 (C). D: Reference genes for the whole-genome sequencing (WGS) data were calculated using CNVnator.38 Data are expressed as means ± SD. n = 3 (A–C); n = 4 (D).
Discussion
Certified reference materials are intended to provide a source of uniform and stable samples that can be used to calibrate measurement techniques to ensure that accurate results are being obtained. It is important to ensure accurate results to allow for the comparison of results from different methods, instruments, laboratories, and operators, as well as to ensure consistent measurements over time.
The NIST-certified values of the ratios of HER2 to the different reference genes were established by extensive measurements using qPCR and dPCR to determine the reference values and measurement confidence values. Multiple instruments, PCR assays, operators, and experiments were used to evaluate the confidence of the certified values. PCR assays were designed for different regions of HER2 (chromosome 17) and the four reference genes (located on chromosomes 2, 4, and 22).23 Assays were developed for the four reference genes, RPS27A, DCK, EIF5B, and PMM1, with chromosomal locations of 2p16, 4q13.3–21.1, 2q11.2, and 22q13.2, respectively (Table 2); these locations were not frequently found to exhibit CNVs in breast cancers, based on literature studies.24, 25, 41 NIST SRM 2372 component A (quantitative DNA reference material from a single male donor) was used to calibrate the qPCR measurements to determine the gene copy number of HER2 and the selected reference genes. NIST SRM 2372 component A and CEPH NA12878 were used to confirm the dPCR measurements at the MoCha laboratory.
Our results support the importance of validating measurement methods and of using multiple reference genes rather than depending on only one. We evaluated several commercial reference genes and found them to be amplified in several of the components, as illustrated by the slight amplification of PMM1 in component D. The NIST-certified values were based on the use of three reference genes in the qPCR and dPCR measurements, and PMM1 was not used for the component D calculations.23
The reproducibility among the replicates of the dPCR and the three NGS methods indicated good precision for each of these methods (Figure 2B). Comparison of three different CNV-calling algorithms using the same WES data also yielded similar HER2 copy number results. Similar bioinformatics efforts have been applied to WGS CNV-calling algorithms, such as RDXplorer,42 Wave CNV,43 CNV-seq,44 and ReadDepth.45 However, when we tried these alternate algorithms on our WGS data, no usable results were obtained because of technical difficulties (eg, excessively long run times, missing dependencies) or the challenges of low coverage in these data sets. Given that CNVnator was chosen for a large-scale study (ie, 1000 Genomes46), our experience supports the view that CNVnator is suitable for CNV detection using low-coverage WGS data.
Not surprisingly, each method demonstrated slightly different HER2 copy numbers, likely associated with platform-specific biases. Whole-exome data have been reported to be somewhat challenging for accurately measuring gene copy numbers,14 whereas WGS has been claimed to be a better method for copy number measurement even at low sequence coverage.13 Our results support that low-coverage (5× in this case) WGS has a wider dynamic range than do target-amplicon sequencing and WES for CNV detection. The deviations in the HER2 amplification results from NGS methods may have been affected by the difference in the locations of the interrogated regions (Figure 2A), the chemistry (eg, capture efficiency in WES and primer-annealing efficiency in target-amplicon sequencing), and the choice of data analysis pipelines.
The absolute copy number measurement using target-amplicon sequencing may not be accurate if the copy number of HER2 is >20. However, the copy number detection by target-amplicon sequencing shows very good agreement with the certified values when the copy number is <20 (Figure 2B). The NCI-MATCH trial used the same target-amplicon sequencing assay, with the threshold copy number for drug treatment set at 7 (patients with a copy number of 7 or above will be treated), which is well within the accurate range (Lih CJ, Harrington RD, Sims DJ, et al., unpublished data). This observation indicates that a wider dynamic range may not be the most essential concern with clinical assays; however, it is important to determine whether the performance of the assay meets the intended application.
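The relationship between the treatment threshold and the accurate range of the assay can be expressed as a small decision helper. This is an illustrative sketch only and not a clinical algorithm; the threshold of 7 and the limit of 20 come from the text, whereas the function name and the returned keys are hypothetical.

```python
def interpret_her2_copy_number(copy_number, treatment_threshold=7.0,
                               accurate_below=20.0):
    """Summarize a targeted-amplicon HER2 copy number call.

    meets_threshold: copy number is at or above the treatment threshold.
    within_accurate_range: copy number is below the level at which the
    targeted-amplicon measurement appeared to plateau.
    """
    return {
        "meets_threshold": copy_number >= treatment_threshold,
        "within_accurate_range": copy_number < accurate_below,
    }


# Hypothetical calls around the threshold and above the plateau.
for value in (5.8, 7.0, 19.4, 35.0):
    print(value, interpret_her2_copy_number(value))
```

Because the threshold lies well below 20, a call above the plateau would still be classified as meeting the threshold even though its absolute value may be underestimated.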
These results clearly demonstrate the value of reference materials for calibrations to permit a uniform basis for comparing different methods. SRM 2373 provides such a uniform and well-characterized set of genomic samples to increase the confidence in HER2 amplification measurements and analytical techniques. We are in the process of developing additional reference materials for important copy number mutations for cancer measurements.
Acknowledgments
We thank Margaret Kline and David Duewer (NIST) for helpful advice on dPCR measurements.
Footnotes
Supported by internal funding from the National Institute of Standards and Technology and partially by the National Cancer Institute, NIH, contracts HHSN261200800001E and NO1-CO-2008-00001.
Disclosures: None declared.
This work neither expresses nor represents the opinions of the National Institute of Standards and Technology, the US Department of Commerce, the National Cancer Institute, the NIH, or the US Department of Health and Human Services. Certain commercial equipment, instruments, or materials are identified in this article to foster understanding. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.
References
- 1. Slamon D.J., Clark G.M., Wong S.G., Levin W.J., Ullrich A., McGuire W.L. Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Science. 1987;235:177–182. doi: 10.1126/science.3798106.
- 2. Carlson R.W., Moench S.J., Hammond W.W., Perez E.A., Burstein H.J., Allred D.C., Vogel C.L., Goldstein L.J., Somlo G., Gradishar W.J., Hudis C.A., Jahanzeb M., Stark A., Wolff A.C., Press M.F., Winer E.P., Paik S., Ljung B.J. HER2 testing in breast cancer: NCCN task force report and recommendations. J Natl Compr Canc Netw. 2006;4:S1–S23.
- 3. Rüschoff J., Hanna W., Bilous M., Hofmann M., Osamura R.Y., Penault-Llorca F., van de Vijver M., Viale G. HER2 testing in gastric cancer: a practical approach. Mod Pathol. 2012;25:637–650. doi: 10.1038/modpathol.2011.198.
- 4. Zhou Z., Hicks D.G. HER2 amplification or overexpression in upper GI tract and breast cancer with clinical diagnosis and treatment. In: Siregar Y., editor. Oncogene and Cancer: From Bench to Clinic. InTech; Rijeka, Croatia: 2013.
- 5. Gown A.M., Goldstein L.C. The knowns and the unknowns in HER2 testing in breast cancer. Am J Clin Pathol. 2011;136:5–6. doi: 10.1309/AJCP53KBPMDRTYDK.
- 6. Ross J.S., Slodkowska E.A., Symmans W.F., Pusztai L., Ravdin P.M., Hortobagyi G.N. The HER-2 receptor and breast cancer: ten years of targeted anti-HER-2 therapy and personalized medicine. Oncologist. 2009;14:320–368. doi: 10.1634/theoncologist.2008-0230.
- 7. Valabrega G., Montemurro F., Aglietta M. Trastuzumab: mechanism of action, resistance and future perspectives in HER2-overexpressing breast cancer. Ann Oncol. 2007;18:977–984. doi: 10.1093/annonc/mdl475.
- 8. Press M., Slamon D., Flom K., Park J., Zhou J., Bernstein L. Evaluation of HER-2/neu gene amplification and overexpression: comparison of frequently used assay methods in a molecularly characterized cohort of breast cancer specimens. J Clin Oncol. 2002;20:3095–3105. doi: 10.1200/JCO.2002.09.094.
- 9. Szollosi J., Balazs M., Feuerstein B., Benz C., Waldman F. ERBB-2 (HER2/NEU) gene copy number, p185(HER-2) overexpression and intratumor heterogeneity in human breast cancer. Cancer Res. 1995;55:5400–5407.
- 10. Koudelakova V., Berkovcova J., Trojanec R., Vrbkova J., Radova L., Ehrmann J., Kolar Z., Melichar B., Hajduch M. Evaluation of HER2 gene status in breast cancer samples with indeterminate fluorescence in situ hybridization by quantitative real-time PCR. J Mol Diagn. 2015;17:446–455. doi: 10.1016/j.jmoldx.2015.03.007.
- 11. Simon R., Roychowdhury S. Implementing personalized cancer genomics in clinical trials. Nat Rev Drug Discov. 2013;12:358–369. doi: 10.1038/nrd3979.
- 12. Wang H., Nettleton D., Ying K. Copy number variation detection using next generation sequencing read counts. BMC Bioinformatics. 2014;15:109. doi: 10.1186/1471-2105-15-109.
- 13. Pirooznia M., Goes F.S., Zandi P.P. Whole-genome CNV analysis: advances in computational approaches. Front Genet. 2015;6:138. doi: 10.3389/fgene.2015.00138.
- 14. Guo Y., Sheng Q., Samuels D.C., Lehmann B., Bauer J.A., Pietenpol J., Shyr Y. Comparative study of exome copy number variation estimation tools using array comparative genomic hybridization as control. Biomed Res Int. 2013;2013:915636. doi: 10.1155/2013/915636.
- 15. Gargis A.S., Kalman L., Berry M.W., Bick D.P., Dimmock D.P., Hambuch T. Assuring the quality of next-generation sequencing in clinical laboratory practice. Nat Biotechnol. 2012;30:1033–1036. doi: 10.1038/nbt.2403.
- 16. Lih C.J., Sims D.J., Harrington R.D., Polley E.C., Zhao Y., Mehaffey M.G., Forbes T.D., Das B., Walsh W.D., Datta V., Harper K.N., Bouk C.H., Rubinstein L.V., Simon R.M., Conley B.A., Chen A.P., Kummar S., Doroshow J.H., Williams P.M. Analytical validation and application of a targeted next-generation sequencing mutation-detection assay for use in treatment assignment in the NCI-MPACT trial. J Mol Diagn. 2016;18:51–67. doi: 10.1016/j.jmoldx.2015.07.006.
- 17. Frampton G.M., Fichtenholtz A., Otto G.A., Wang K., Downing S.R., He J. Development and validation of a clinical cancer genomic profiling test based on massively parallel DNA sequencing. Nat Biotechnol. 2013;31:1023–1031. doi: 10.1038/nbt.2696.
- 18. May W., Parris R., Beck C., Fassett J., Greenberg R., Guenther F., Kramer G., Wise S., Gillis T., Colbert J., Gettings R., MacDonald B. Standard Reference Materials®, definitions of terms and modes used at NIST for value-assignment of reference materials for chemical measurements. NIST Special Publication 260-136. 2000.
- 19. White H.E., Matejtschuk P., Rigsby P., Gabert J., Lin F., Wang Y.L., Branford S., Müller M.C., Beaufils N., Beillard E., Colomer D., Dvorakova D., Ehrencrona H., Goh H.G., El Housni H., Jones D., Kairisto V., Kamel-Reid S., Kim D.W., Langabeer S., Ma E.S.K., Press R.D., Romeo G., Wang L., Zoi K., Hughes T., Saglio G., Hochhaus A., Goldman J.M., Metcalfe P., Cross N.C. Establishment of the first World Health Organization International Genetic Reference Panel for quantitation of BCR-ABL mRNA. Blood. 2010;116:e111–e117. doi: 10.1182/blood-2010-06-291641.
- 20. Kalman L.V., Amos Wilson J., Buller A., Dixon L., Edelmann L., Geller L., Highsmith W.E., Holtegaard L., Kornreich R., Rohlfs E.M., Payeur T., Sellers T., Muralidharan K. Characterization of genomic DNA reference materials for genetic testing of disorders common in people of Ashkenazi Jewish descent. J Mol Diagn. 2009;11:530–536. doi: 10.2353/jmoldx.2009.090050.
- 21. Barker S.D., Bale S., Buller A., Das S., Friedman K., Godwin A.K., Grody W., Highsmith E., Kant J., Lyon E., Mao R., Monaghan K.G., Payne D.A., Pratt V.M., Roa B., Schrijver I., Shrimpton A.E., Spector E., Telatar M., Weck K., Zehnbauer B., Booker J., Kalman L.V. Development and characterization of reference materials for MTHFR, SERPINA1, RET, BRCA1, and BRCA2 genetic testing. J Mol Diagn. 2009;11:553–561. doi: 10.2353/jmoldx.2009.090078.
- 22. Pratt V.M., Everts R.E., Aggarwal P., Beyer B.N., Broeckel U., Epstein-Baak R., Hujsak P., Kornreich R., Liao J., Lorier R., Scott S.A., Smith C.H., Toji L.H., Turner A., Kalman L.V. Characterization of 137 genomic DNA reference materials for 28 pharmacogenetic genes: a GeT-RM collaborative project. J Mol Diagn. 2016;18:109–123. doi: 10.1016/j.jmoldx.2015.08.005.
- 23. He H.J., Almeida J.L., Lund S., Steffen C.R., Choquette S., Cole K.D. Development of NIST Standard Reference Material 2373: genomic DNA standards for HER2 measurements. Biomol Detect Quantif. 2016;8:1–8. doi: 10.1016/j.bdq.2016.02.001.
- 24. Neve R., Chin K., Fridlyand J., Yeh J., Baehner F., Fevr T., Clark L., Bayani N., Coppe J., Tong F., Speed T., Spellman P., DeVries S., Lapuk A., Wang N., Kuo W., Stilwell J., Pinkel D., Albertson D., Waldman F., McCormick F., Dickson R., Johnson M., Lippman M., Ethier S., Gazdar A., Gray J. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell. 2006;10:515–527. doi: 10.1016/j.ccr.2006.10.008.
- 25. Kytola S., Rummukainen J., Nordgren A., Karhu R., Farnebo F., Isola J., Larsson C. Chromosomal alterations in 15 breast cancer cell lines by comparative genomic hybridization and spectral karyotyping. Genes Chromosomes Cancer. 2000;28:308–317. doi: 10.1002/1098-2264(200007)28:3<308::aid-gcc9>3.0.co;2-b.
- 26. Roschke A.V., Tonon G., Gehlhaus K.S., McTyre N., Bussey K.J., Lababidi S., Scudiero D.A., Weinstein J.N., Kirsch I.R. Karyotypic complexity of the NCI-60 drug-screening panel. Cancer Res. 2003;63:8634–8647.
- 27. Davidson J.M., Gorringe K.L., Chin S.F., Orsetti B., Besret C., Courtay-Cahen C., Roberts I., Theillet C., Caldas C., Edwards P.A. Molecular cytogenetic analysis of breast cancer cell lines. Br J Cancer. 2000;83:1309–1317. doi: 10.1054/bjoc.2000.1458.
- 28. Popovici C., Basset C., Bertucci F., Orsetti B., Adelaide J., Mozziconacci M.J., Conte N., Murati A., Ginestier C., Charafe-Jauffret E., Ethier S.P., Lafage-Pochitaloff M., Theillet C., Birnbaum D., Chaffanet M. Reciprocal translocations in breast tumor cell lines: cloning of a t(3;20) that targets the FHIT gene. Genes Chromosomes Cancer. 2002;35:204–218. doi: 10.1002/gcc.10107.
- 29. Zook J.M., Chapman B., Wang J., Mittelman D., Hofmann O., Hide W., Salit M.L. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nat Biotechnol. 2014;32:246–251. doi: 10.1038/nbt.2835.
- 30. Bustin S.A., Benes V., Garson J.A., Hellemans J., Huggett J., Kubista M., Mueller R., Nolan T., Pfaffl M.W., Shipley G.L., Vandesompele J., Wittwer C.T. The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin Chem. 2009;55:611–622. doi: 10.1373/clinchem.2008.112797.
- 31. Huggett J.F., Foy C.A., Benes V., Emslie K., Garson J.A., Haynes R., Hellemans J., Kubista M., Mueller R.D., Nolan T., Pfaffl M.W., Shipley G.L., Vandesompele J., Wittwer C.T., Bustin S.A. The digital MIQE guidelines: minimum information for publication of quantitative digital PCR experiments. Clin Chem. 2013;59:892–902. doi: 10.1373/clinchem.2013.206375.
- 32. Li H., Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 2010;26:589–595. doi: 10.1093/bioinformatics/btp698.
- 33. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
- 34. Klambauer G., Schwarzbauer K., Mayr A., Clevert D.A., Mitterecker A., Bodenhofer U., Hochreiter S. cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate. Nucleic Acids Res. 2012;40:e69. doi: 10.1093/nar/gks003.
- 35. Li J., Lupat R., Amarasinghe K.C., Thompson E.R., Doyle M.A., Ryland G.L., Tothill R.W., Halgamuge S.K., Campbell I.G., Gorringe K.L. CONTRA: copy number analysis for targeted resequencing. Bioinformatics. 2012;28:1307–1313. doi: 10.1093/bioinformatics/bts146.
- 36. Sathirapongsasuti J.F., Lee H., Horst B.A., Brunner G., Cochran A.J., Binder S., Quackenbush J., Nelson S.F. Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV. Bioinformatics. 2011;27:2648–2654. doi: 10.1093/bioinformatics/btr462.
- 37. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
- 38. Abyzov A., Urban A.E., Snyder M., Gerstein M. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res. 2011;21:974–984. doi: 10.1101/gr.114876.110.
- 39. Wang L.Y., Abyzov A., Korbel J.O., Snyder M., Gerstein M. MSB: a mean-shift-based approach for the analysis of structural variation in the genome. Genome Res. 2009;19:106–117. doi: 10.1101/gr.080069.108.
- 40. Hovelson D.H., McDaniel A.S., Cani A.K., Johnson B., Rhodes K., Williams P.D., Bandla S., Bien G., Choppa P., Hyland F., Gottimukkala R., Liu G., Manivannan M., Schageman J., Ballesteros-Villagrana E., Grasso C.S., Quist M.J., Yadati V., Amin A., Siddiqui J., Betz B.L., Knudsen K.E., Cooney K.A., Feng F.Y., Roh M.H., Nelson P.S., Liu C.J., Beer D.G., Wyngaard P., Chinnaiyan A.M., Sadis S., Rhodes D.R., Tomlins S.A. Development and validation of a scalable next-generation sequencing system for assessing relevant somatic variants in solid tumors. Neoplasia. 2015;17:385–399. doi: 10.1016/j.neo.2015.03.004.
- 41. Bergamaschi A., Kim Y.H., Wang P., Sorlie T., Hernandez-Boussard T., Lonning P.E., Tibshirani R., Borresen-Dale A.L., Pollack J.R. Distinct patterns of DNA copy number alteration are associated with different clinicopathological features and gene-expression subtypes of breast cancer. Genes Chromosomes Cancer. 2006;45:1033–1040. doi: 10.1002/gcc.20366.
- 42. Yoon S., Xuan Z., Makarov V., Ye K., Sebat J. Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Res. 2009;19:1586–1592. doi: 10.1101/gr.092981.109.
- 43. Holt C., Losic B., Pai D., Zhao Z., Trinh Q., Syam S., Arshadi N., Jang G.N., Ali J., Beck T., McPherson J., Muthuswamy L.B. WaveCNV: allele-specific copy number alterations in primary tumors and xenograft models from next-generation sequencing. Bioinformatics. 2014;30:768–774. doi: 10.1093/bioinformatics/btt611.
- 44. Xie C., Tammi M.T. CNV-seq, a new method to detect copy number variation using high-throughput sequencing. BMC Bioinformatics. 2009;10:80. doi: 10.1186/1471-2105-10-80.
- 45. Miller C.A., Hampton O., Coarfa C., Milosavljevic A. ReadDepth: a parallel R package for detecting copy number alterations from short sequencing reads. PLoS One. 2011;6:e16327. doi: 10.1371/journal.pone.0016327.
- 46. Mills R.E., Walter K., Stewart C., Handsaker R.E., Chen K., Alkan C., 1000 Genomes Project. Mapping copy number variation by population-scale genome sequencing. Nature. 2011;470:59–65. doi: 10.1038/nature09708.



