Abstract
Somatic hypermutation status of the IGHV gene is essential for treating patients with chronic lymphocytic leukemia/small lymphocytic lymphoma. Unlike the conventional low-throughput method, assessment of somatic hypermutation by next-generation sequencing (NGS) has potential for uniformity and scalability. However, it lacks standardization or guidelines for routine clinical use. We critically assessed the performance of an amplicon-based NGS assay across 458 samples. Using a validation cohort (35 samples), the comparison of two platforms (Ion Torrent versus Illumina) and two primer sets [leader versus framework region 1 (FR1)] in their ability to identify clonotypic IGHV rearrangement(s) revealed 97% concordance. The mutation rates were identical by both platforms when using the same primer set (FR1), whereas a slight overestimation bias (+0.326%) was found when comparing FR1 with leader primers. However, for nearly all patients this did not affect the stratification into mutated or unmutated categories, suggesting that use of FR1 may provide comparable results if leader sequencing is not available and allowing for a simpler NGS laboratory workflow. In routine clinical practice (423 samples), the productive rearrangement was successfully detected by either primer set (leader, 97.7%; FR1, 94.7%), and a combination of both in problematic cases reduced the failure rate to 1.2%. Higher sensitivity of the NGS-based analysis also detected a higher frequency of double IGHV rearrangements (19.1%) compared with traditional approaches.
Chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL) is a malignant neoplastic proliferation of small mature B cells that typically co-express CD5 and CD23, among other markers, and involve blood, bone marrow, and secondary lymphoid tissues.1 Clinical outcomes vary widely, with some patients having an indolent course without the need for therapy and others experiencing rapid disease progression and death despite treatment. Among several risk stratification systems, the somatic hypermutation (SHM) status of the rearranged IGHV gene in tumor cells has long been recognized as a robust independent prognosticator of disease aggressiveness.2,3 Unmutated IGHV, defined as a sequence with ≤2% difference (or ≥98% sequence homology) from the germline reference sequence, is associated with inferior clinical outcomes, including lower progression-free survival and overall survival, compared with the mutated counterpart.2, 3, 4 SHM status has also been associated with differences in response to therapy and thus serves as a predictive disease parameter.5, 6, 7, 8, 9
SHM is an important physiologic process integral to normal B-cell differentiation. Briefly, during B-cell maturation, somatic rearrangement of the immunoglobulin genes occurs by a process of V(D)J recombination accompanied by junctional diversity, which contributes to the uniqueness of the immunoglobulin molecule expressed on each B cell.10,11 The diversity of the immunoglobulin repertoire is further increased in mature B cells when they migrate through the germinal center and undergo activation-induced deaminase–mediated SHM, aiming to increase the immunoglobulin affinity for antigen. The rearranged VDJ is composed of seven regions: three complementarity-determining regions (CDR1 to CDR3) arranged consecutively with four intervening framework regions (FR1 to FR4). CDRs are highly variable (hypervariable regions), and structurally they represent loops necessary for interaction with antigen, whereas FRs are structurally conserved β-sheets that flank and serve as a protein scaffold for the CDRs.12 The V region encompasses FR1, CDR1, FR2, CDR2, and FR3, whereas CDR3 includes the most distal part of V and all of the D and J regions and is known to harbor the most sequence variability of the immunoglobulin molecule.13 Nucleotide changes that result from SHM are known to preferentially target the CDRs.14 Primers used to amplify the rearranged VDJ segment of IGH have been specifically designed and standardized to target the less variable FRs15 or the highly conserved leader sequence upstream of FR116,17 rather than the highly variable CDRs to reduce PCR failures due to poor primer annealing.
Given the prognostic and predictive value of SHM in patients with CLL, consensus recommendations for testing and reporting have been formulated by the European Research Initiative on CLL (ERIC).17,18 The original 2007 guidelines established Sanger sequencing using either leader or FR1 primers as the gold standard method. However, testing uniformity across laboratories has not been achieved because of the technical complexity, labor intensiveness, and limited scalability of this method. In recent years, next-generation sequencing (NGS) has emerged as a suitable approach for SHM assessment, but guidelines do not yet address the required technical or interpretative parameters for its use in standard clinical practice. The 2017 and 2022 updated ERIC recommendations acknowledged that laboratories may be using NGS for determination of SHM status. However, further guidelines regarding this method are not provided. These recommendations emphasize using leader rather than the FR1 primers, which do not amplify the entire V region and are assumed to lead to inaccurate risk stratification.17,19 In addition, sequencing data analysis of CLL has recently elucidated that approximately 41% of patients harbor IGHV rearrangements that can be grouped into stereotyped subsets that are characterized by distinctive sequence motifs within CDR3, and these can be used to further refine the classification into prognostic subsets with similar biologic characteristics and clinical outcomes.20, 21, 22, 23, 24 Among 19 major stereotyped subsets with emerging prognostic importance, stereotyped subset 2 (IGHV3-21), for example, has a poor prognosis irrespective of SHM status. The ERIC recommendations also address the importance of evaluating the entire heavy chain CDR3 to determine if the productive rearrangement leads to inclusion in a major stereotyped subset.
In addition, the NGS-characterized sequence of the VDJ segment represents a patient-specific marker with broad applicability in subsequent disease monitoring and measurable or minimal residual disease assessment.
This study describes the implementation of a commercially available NGS-based assay for routine IGHV sequence analysis in CLL. We concentrate primarily on clonotype detection and assessment of SHM status, comparing the performance of Ion PGM with MiSeq platforms and critically evaluating the use of leader primers versus FR1. We further describe our clinical experience with the assay in a large clinical implementation cohort.
Materials and Methods
Patients and Specimen Selection
Clinical samples from patients with CLL, based on the 2017 World Health Organization diagnostic criteria,1 received for routine assessment of IGHV SHM were selected for the study. The corresponding pathology reports were manually reviewed to confirm the diagnosis, and patients with more than one neoplastic B-cell/plasma cell process by flow cytometry (n = 44) were excluded to eliminate confounding IGHV assessment. The study was approved by the institutional review board and was performed in accordance with the Declaration of Helsinki. Genomic DNA was extracted from peripheral blood, bone marrow, or formalin-fixed, paraffin-embedded tissue using standard protocols. DNA quality and quantity were assessed using a Qubit dsDNA HS Assay Kit or by Quant-iT dsDNA Broad Range (Thermo Fisher Scientific, Waltham, MA) with a SpectraMax M2 fluorescence microplate reader (Molecular Devices, San Jose, CA) according to the manufacturer's instructions.
Initial Validation
Thirty-five samples from patients with CLL and known SHM status who had sufficient genetic material were selected for initial assay validation and platform comparisons. All cases were previously characterized by conventional methods for the presence of IGHV clonality and Sanger sequencing–based SHM assessment by a clinically validated assay (Cancer Genetics, Inc., Rutherford, NJ) and in-house testing (subset) using the IGH Somatic Hypermutation Assay version 2.0 (Invivoscribe, Inc., San Diego, CA) following the manufacturer's protocol. NGS testing was performed on each sample using LymphoTrack assays (Invivoscribe, Inc.) formatted for the Illumina MiSeq (Illumina, Inc., San Diego, CA) (leader and FR1 primer sets) and Ion Torrent PGM (Thermo Fisher Scientific) (FR1 primer set) platforms as described below (summarized in Supplemental Table S1).
Library Preparation and NGS
Library preparation used 50 to 250 ng of genomic DNA (gDNA) and commercially available primers (Invivoscribe, Inc.) targeting the IGH FR1 region (FR1) or the conserved leader sequence upstream of FR1 (leader) following the manufacturer's protocols. Briefly, amplification by PCR was performed using the specified master mixes that contained primers designed with 24 (MiSeq) or 12 (PGM) barcoded sequence adaptors. After purification and quantification, libraries generated from LymphoTrack PCR master mixes for Illumina platforms were sequenced on an Illumina MiSeq Instrument with paired-end sequencing at read lengths of 2 × 250 or 2 × 300 base pairs (bp) for FR1 and leader primer sets, respectively, whereas libraries generated from PCR master mixes for PGM were sequenced using the Ion PGM Template OT2 400 kit and Sequencing 400 kit with Ion 316 Chip on Ion PGM. Standard quality control metrics were considered optimal for MiSeq Illumina and PGM runs (Supplemental Figure S1). No strict cutoff was imposed on the total output for the run. However, at the sample level, coverage of <30,000 reads was deemed inadequate for interpretation for MiSeq sequenced samples, as previously established during prior assay validation,25 and for PGM it was based on the manufacturer's recommendations: failure when <15,000 reads if the top sequence(s) accounted for ≥2.5% of total reads and <10,000 reads if the top sequence(s) was ≥5% of total reads. Additional details of the performance characteristics (limit of detection, precision, and reproducibility) of the NGS assays are summarized in Supplemental Tables S2 and S3.
Data Analysis
Sequencing results were analyzed using the LymphoTrack IGH SHM MiSeq version 2.3.1 or version 2.4.3 or PGM Software version 2.3.1 (Invivoscribe, Inc.), where appropriate. In addition, for the prospective clinical implementation, an in-house developed analysis pipeline (which uses IMGT-based alignment)25,26 was used for streamlined evaluation and visualization. Identification of dominant clonal sequence(s) was based on criteria developed during assay validation.25 Briefly, following merging of similar sequences based on the amplicon size, V-J usage, and sequence identity, dominant sequences accounting for >5% of the total reads and representing >10 times the polyclonal background were identified as clonal. Sequences between 2.5% and 5% were considered clonal if present at >20 times above a rich polyclonal background. Sequences were also annotated as predicted to be productive or unproductive.
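The quantitative clonality criteria above can be illustrated with a minimal sketch. The class and function names are hypothetical, the way the polyclonal background ratio is measured is not specified here and is simply passed in as a number, and treating the 2.5% and 5% boundaries as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MergedSequence:
    """A merged sequence after grouping by amplicon size, V-J usage, and identity.

    Both fields are hypothetical inputs; how the background ratio is derived
    is an assumption and is not defined by this sketch.
    """
    percent_of_reads: float       # share of total reads, in percent
    fold_over_background: float   # ratio to the polyclonal background


def is_clonal(seq: MergedSequence) -> bool:
    """Apply the two-tier reporting criteria described in the text."""
    if seq.percent_of_reads > 5.0:
        # Dominant sequence: must exceed 10 times the polyclonal background.
        return seq.fold_over_background > 10.0
    if 2.5 <= seq.percent_of_reads <= 5.0:
        # Lower-abundance sequence: stricter 20-fold requirement.
        return seq.fold_over_background > 20.0
    # Below 2.5% of reads, the sequence is not reported as clonal.
    return False
```

For example, a sequence at 4% of reads would be called clonal at 25 times background but not at 15 times background.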
SHM status was evaluated based on conventional criteria according to the latest EuroClonality/BIOMED-2 guidelines27 and the updated ERIC recommendations in CLL.17,19 A mutation rate >2% compared with germline IGHV sequence was defined as mutated, whereas 0% to 2.00% was defined as unmutated. Quantification of the mutation rate was performed by the LymphoTrack IGH SHM software, which compares the IGHV sequence with the consensus germline sequence in the IMGT/V-QUEST database28 (International Immunogenetics Information System, http://www.imgt.org, last accessed June 10, 2022), and the percentage of base pair mismatches from the germline sequence was calculated. Mutation rate was also manually confirmed using IGBLAST29 (National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/igblast, last accessed June 10, 2022). Before analysis, the sequence was trimmed, and then the entire amplified sequence generated using leader or the FR1 multiplex primer mixes was used for the comparison to germline, including CDR3.
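As a hedged illustration of the classification rule, the sketch below computes a mutation rate as the percentage of mismatched bases over the aligned length and applies the 2% cutoff. The function names are hypothetical, and the sketch does not reproduce the alignment or trimming performed by the LymphoTrack software, IMGT/V-QUEST, or IGBLAST; it assumes the mismatch count and aligned length are already available.

```python
def mutation_rate_percent(mismatches: int, aligned_length: int) -> float:
    """Percentage of base mismatches relative to the germline alignment."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= mismatches <= aligned_length:
        raise ValueError("mismatches must be between 0 and aligned_length")
    return 100.0 * mismatches / aligned_length


def shm_status(rate_percent: float) -> str:
    """Mutated if the rate is above 2%; 0% to 2.00% is unmutated."""
    return "mutated" if rate_percent > 2.0 else "unmutated"
```

Under this rule a sequence with 6 mismatches over 300 aligned bases (2.00%) is unmutated, whereas a rate of 2.36% is mutated.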
In Silico Evaluation of Mutation Distribution in Cases Evaluated by Leader Primers
To analyze the distribution of mutation bias along different IGHV regions (FR1, CDR1, FR2, CDR2, FR3, and CDR3), clonotype sequences generated by leader were evaluated by the IGBLAST web-based tool. The percentage of homology for each region was calculated by counting the number of nucleotide differences between the 5′ end of FR1 and the 3′ end of CDR3 of the IGHV sequence. This analysis was restricted to cases with any degree of mismatches from the germline sequence (mutation rate >0%). The mutation rate for each IGHV region was assessed, as was the overall effect on the mutation rate of removing the entire FR1 region in silico. Results were compared with the mutation rate generated by leader and FR1 primers among the validation set.
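The in silico FR1 removal can be sketched as a recalculation of the overall mutation rate from per-region mismatch counts. This is a minimal sketch under stated assumptions: the function name and the dictionary layout are hypothetical, and per-region counts are assumed to have been taken from an IGBLAST-style alignment summary beforehand.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical layout: region name -> (mismatches, aligned length in bp).
RegionCounts = Dict[str, Tuple[int, int]]


def overall_rate_percent(counts: RegionCounts, exclude: Iterable[str] = ()) -> float:
    """Overall mutation rate (%) across regions, optionally dropping some regions."""
    skipped = set(exclude)
    kept = [(m, n) for region, (m, n) in counts.items() if region not in skipped]
    total_length = sum(n for _, n in kept)
    if total_length == 0:
        raise ValueError("no aligned bases remain after exclusion")
    return 100.0 * sum(m for m, _ in kept) / total_length
```

Calling `overall_rate_percent(counts)` gives the intact rate and `overall_rate_percent(counts, exclude=["FR1"])` gives the rate with FR1 removed. When FR1 carries fewer mismatches per base than the remaining regions, removing it raises the overall rate, because the denominator shrinks more than the numerator.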
Statistical Analysis
Platforms were compared by the Bland-Altman method. Statistical data were analyzed using GraphPad Prism software version 7.0 (GraphPad Software, San Diego, CA). For continuous variables, data were reported as means and ranges and compared with the t-test. The Fisher exact test or χ2 test was used for categorical comparisons, as appropriate. All P values were two-tailed and were considered significant when <0.05.
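For readers unfamiliar with the Bland-Altman method, the following minimal sketch shows how the bias and 95% limits of agreement are conventionally derived from paired measurements. It is not the GraphPad Prism implementation; the function name is hypothetical, and the use of the sample standard deviation with a 1.96 multiplier is the conventional choice, assumed here.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements a and b.

    bias is the mean of the paired differences (a - b); the 95% limits of
    agreement are bias +/- 1.96 times the sample SD of the differences.
    """
    if len(a) != len(b):
        raise ValueError("a and b must be paired (same length)")
    if len(a) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A positive bias with `a` as one method and `b` as the other indicates that the first method reports systematically higher values.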
Prospective Assay Implementation
Following the initial validation, 423 consecutive clinical CLL samples submitted between April 2016 and September 2019 for routine SHM assessment were analyzed on Illumina MiSeq using the FR1 and/or leader primer sets. Initially, 470 samples were identified, but 47 were excluded for the following reasons: i) harbored more than one immunophenotypically distinct abnormal B-cell population by flow cytometry, including those with biphenotypic CLL (n = 26); ii) presence of additional lymphoid neoplasms (n = 18); or iii) patients with monoclonal B-lymphocytosis, not fulfilling the diagnostic World Health Organization criteria for CLL/SLL (n = 3). In cases where the clonotypic sequence identification was ambiguous or unproductive or the mutation rate was borderline (2% to 3%), testing with the alternate primer mix was also performed.
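The reflex-testing rule in the last sentence above can be written as a small decision function. This is a sketch only: the function and parameter names are hypothetical, and treating the borderline range as inclusive at both 2% and 3% is an assumption.

```python
from typing import Optional


def needs_alternate_primer(
    ambiguous_clonotype: bool,
    productive: bool,
    mutation_rate_percent: Optional[float],
) -> bool:
    """Decide whether a sample should also be tested with the alternate primer mix."""
    if ambiguous_clonotype or not productive:
        # Ambiguous identification or an unproductive rearrangement triggers reflex testing.
        return True
    if mutation_rate_percent is None:
        # No rate available and no other trigger: nothing to reflex on.
        return False
    # Borderline mutation rate (2% to 3%) triggers reflex testing.
    return 2.0 <= mutation_rate_percent <= 3.0
```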
IGH CDR3 Region Stereotypy Determination
All productive clonotypes generated by both leader and FR1 primer sets (a total of 435 from 418 patient samples) were analyzed using the publicly available web-based ARResT/AssignSubsets tool (ARResT/AssignSubsets, http://tools.bat.infspire.org/arrest/assignsubsets, last accessed April 23, 2022).30 The primary output provided was used without any further modifications, and no further reassignment was performed to the nearest subset if the sequence was unassigned. Circos plots depicting the frequency of the immunoglobulin heavy chain (IGH) VJ family combinations among these sequences were generated in R version 4.0.2 (https://posit.co/download/rstudio-desktop) using the circlize package (R Foundation for Statistical Computing, Vienna, Austria).31
Results
Validation Cohort
Detection of the Dominant Clonotype and Its SHM Status Overall Shows Excellent Agreement by NGS Compared With the Reference Method
The summarized comparison of platforms and primers is depicted in Figure 1A. In combination, FR1 and leader primers reached 100% agreement with the reference method in their ability to detect the dominant clonotypic sequence and qualitative SHM status. Independently, there were three major discrepancies with the consensus established by the reference methods: case 1: failure by leader primer due to low overall read count (however, the dominant clonotype among available reads and SHM status are in agreement with the reference); case 40: sequencing error in two homopolymer regions by PGM, leading to a discrepant interpretation of the dominant sequence (unproductive by PGM and productive by all others, alignment shown in Supplemental Figure S2); and case 48: SHM of the major clonotype interpreted as unmutated (mutation rate, 1.76%) by FR1 primers (MiSeq and PGM), unambiguously mutated by the reference methods (5.2%), and borderline by leader (2.36%) (alignment shown in Supplemental Figure S3). Minor discrepancies in four cases (cases 6, 12, 38, 39) resulted from the detection of additional minor clonotypes by NGS, which could not be detected by Sanger sequencing because of low sensitivity and the inability to resolve clonotypes of similar sizes. Conversely, in three cases (cases 14, 28, and 47) the minor discrepancy was due to the additional subclonal sequences being detected by the reference methods, whereas it was present but below quantitative criteria in some of the NGS-based results. Details of all discrepancies are summarized in Supplemental Table S1.
Figure 1.
Comparison of platforms and primers using 35 validation chronic lymphocytic leukemia/small lymphocytic lymphoma samples. A: Summary of sequencing results by framework region (FR) 1 PGM, FR1 MiSeq, leader MiSeq, and the reference method. The bottom panel shows color-coded qualitative results as explained in the key. The top panel shows bar graphs corresponding to the percentage of total reads for each clonal sequence detected. B: The detection of clonal sequences using the FR1 primer set with PGM versus MiSeq platforms was compared. C: The MiSeq platform was used to compare output by the FR1 and leader primer sets (excluding case 1, which failed sequencing by leader). Major discordant results were seen when there was a failure by one assay and not others (case 1 failed sequencing by leader based on the total reads), when the dominant sequence did not meet quantitative criteria above background for unequivocal clonotype reporting, or if the identified clonotype showed significant qualitative differences [productive versus unproductive or mutated (+) versus unmutated (0)]. Exact results were seen when the sequence identity and the number of sequences detected that fulfilled criteria for clonality were identical. Cases with minor discordance showed additional minor clonal sequences that met clonality criteria by one primer set or platform and not the other, and these are designated as detected extra sequences. F, failure; ND, not detected; NR, not reported.
Platform Performance: Illumina MiSeq Versus Ion PGM Platform Comparison Shows Similar Clonotype Detection Sensitivity and Identical Mutation Rates
Platforms were compared using the FR1 primer set only. The choice of FR1 was dictated by the inherent chemistry of the Ion PGM platform at the time of this validation, which only allowed sequencing of shorter fragments compared with MiSeq and lacked primers that targeted the leader region due to long amplicon size. All samples were successfully sequenced using both platforms with sequencing characteristics as summarized in Figure 1A and Supplemental Table S4. On the basis of the recommended manufacturer's protocols, samples sequenced on the MiSeq platform generated a significantly higher number of total reads compared with the PGM (495,060 versus 191,293) (Supplemental Table S4), as expected because of the inherently higher sequencing output by MiSeq and similar frequencies of the dominant clonotype. Because of the inherent differences in chemistries and raw data processing by the two platforms, base quality parameters are difficult to compare directly: samples sequenced on MiSeq had a mean Q30 score (percentage of bases with a quality score of ≥30) of 87.5%, whereas samples sequenced on PGM had a mean Q20 score (percentage of bases with a quality score of ≥20) of 92.5% (Supplemental Figure S1), with Q30 scores consistently below 80%.
In all, among the 35 cases sequenced, 53 independent clonotypic sequences were detected by the two platforms in combination (40 productive and 13 unproductive), including 13 of 35 patient samples (37%) with more than one dominant clonotypic sequence. Each platform detected 51 sequences, and 48 were identical between the two (detailed in Figure 1A). The overall concordance was 97% (n = 34/35) with respect to the dominant productive clonotype and the SHM status of the sample (Figure 1B), with the single discordant result (case 40, homopolymer error) as discussed above. Minor discrepancies were identified in two cases in which additional minor clonotypes (subclonal sequences) met quantitative reporting criteria by only one of the two platforms (case 38: extra by PGM; case 39: extra by MiSeq) (Figure 1A). In both cases, the clonal sequences could be identified in the alternate platform when specifically searched for but did not satisfy quantitative criteria for reporting. The SHM rates for all matching clonal sequences (48 sequences total) were identical between the two platforms (Supplemental Figure S4).
Primer Set Performance: Leader and FR1 Primer Sets Show Similar Clonotype Detection Sensitivity and Similar Mutation Rates
The 35 validation samples were sequenced on the MiSeq with both leader and FR1 primer sets to compare their performance characteristics (Figure 1A and Supplemental Table S1). All samples were sequenced successfully except for one (case 1), which failed leader sequencing due to low total number of reads (<30,000), although the same dominant clonotype was present below reporting criteria with a concordant qualitative SHM status. Overall, the mean total number of reads generated by each primer set was comparable (FR1: 495,060; leader: 541,749) (Supplemental Table S4), and sequence quality metrics were similar: FR1 sequencing with a mean Q30 score of 87.5% compared with 86.0% with leader (Supplemental Table S4 and Supplemental Figure S1). As expected, the clonotype sequences amplified by leader were longer (range, 461 to 518 bp) compared with FR1 (range, 266 to 340 bp). Among the 35 cases, a total of 53 clonal sequences were identified using the two primer sets in combination (FR1 detected 51 and leader 48), and 46 were identical between the two (detailed in Figure 1A). In all, 15 of 35 cases (43%) harbored >1 dominant clonotype (range, 2 to 3). For the 34 cases successfully sequenced with both primer sets, there was 100% concordance (n = 34/34) in their ability to detect the top dominant clonotype (Figure 1C). Minor discrepancies were identified in 6 of 34 cases (18%: cases 14, 17, 28, 29, 38, and 47); all were due to the detection of additional minor subclonal sequences that met reporting criteria by one primer set but not the other (four by FR1 and two by leader) (Figure 1C). These differences were not statistically significant (Fisher exact test) and may represent amplification bias.
The mutation rate data generated by the two primer sets (FR1 and leader) were compared for all clonotypes detected by both primer sets (46 total), including productive (n = 39) and unproductive (n = 7) rearrangements. The overall agreement was excellent, showing a strong correlation (R2 = 0.9837, P < 0.0001) (Figure 2A). However, a Bland-Altman analysis of the differences versus means in the two mutation rates revealed a systematic bias with a slight but consistently higher mutation rate in the FR1 output compared with leader (bias mean = +0.326; 95% limits of agreement, −0.79 to 1.44) (Figure 2B), which likely reflects the differing FR1 lengths captured by the leader and FR1 primers. Despite the bias, stratification into unambiguously mutated versus unmutated SHM categories was not affected for nearly all [n = 45/46 (98%)] sequences (Figure 2C) except in one dominant productive clonotype (case 48) with a borderline somatic hypermutation rate near 2% (alignment shown in Supplemental Figure S3), as discussed above.
Figure 2.
Validation study: comparison of mutation rates by framework region (FR) 1 and leader primer sets for all matching clonotypes (n = 46) identified by sequencing on the MiSeq platform. A: Scatterplot showing the correlation and linear regression analysis of mutation rates (r = 0.9918; 95% CI, 0.9852 to 0.9955; R2 = 0.9837; P < 0.0001). The best fit linear regression line (black line) and 95% CIs (red dashed lines) are shown. B: Bland-Altman plot showing a difference versus mean analysis of the mutation rates identified by the two primer sets (FR1 versus leader) with a mean ± bias of 0.326 ± 0.6644 and 95% limits of agreement of −0.79 to 1.44 (red dotted lines). The green line represents if the difference between the assays was zero (0). C: Histogram showing frequency distribution of mutation rates grouped by bins: 0%, 0.1% to 1.4%, 1.5% to 1.9%, 2.0% to 3.0%, and >3.1%. The two sets of bars correspond to results obtained by leader (black bars) and FR1 (gray bars) primer sets. D: Clinical implementation: histogram showing frequency distribution of mutation rates grouped by bins as in C. The mutation rate bins of all productive rearrangements detected in samples by leader or FR1 sequencing (435 productive clonotypes in 423 samples) are shown. Among the 228 productive rearrangements by leader and 207 by FR1 primer sets, the distribution of the mutation rates showed no statistically significant differences (χ2). Clonotypes with borderline somatic hypermutation rate (2.0% to 3.0%) account for a small proportion of the total productive rearrangements [overall, 3.4% (n = 15/435)], detected at similar rates by both primer sets: 2.6% (n = 6/228) for leader and 3.4% (n = 7/207) for FR1.
To assess the impact of sequencing the entire FR1 region (using leader primers) versus partial or no FR1 (FR1 primers), we performed a series of in silico analyses focusing on the matching productive clonal sequences (39 clonotypes from 34 samples) detected by both primer sets, given the established clinical significance of the productive IGHV sequences.2,18 The length of FR1 when sequenced with the FR1 primer set averaged 6.5 bp (range, 3 to 10 bp), which represents 8.7% of the total FR1 length, whereas leader allowed for sequencing of the entire FR1 region in all instances (75 bp) (Table 1). The distribution of mutations along each of the IGHV regions (FR1, CDR1, FR2, CDR2, FR3, and CDR3) was evaluated from leader-amplified sequences with <100% identity compared with germline. As expected, the FR1 region showed the lowest rate of mutations (mean, 2%; range, 0% to 8%), whereas CDR2 and CDR1 contributed a larger percentage of the total mismatched bases per region (Supplemental Figure S5). In silico removal of the entire FR1 region from the intact leader output (using 39 clonotypes) generated mutation rates that were similar to the intact leader results (Supplemental Figure S6A), showing excellent correlation and linear regression (r = 0.9867; 95% CI, 0.975 to 0.993; R2 = 0.974; P < 0.0001). The Bland-Altman analysis shows that complete removal of the FR1 region leads to consistently higher mutation rates compared with the intact leader-generated sequences (mean bias, +0.480; 95% limits of agreement, −0.945 to 1.90) (Supplemental Figure S6B); however, this bias had no impact on assigning sequences to unmutated, borderline, and mutated categories (mutation rates, ≤1.9%, 2% to 3%, or >3%, respectively) (Supplemental Figure S6C).
Table 1.
Mean Length and Proportion of the Total Framework Region (FR) 1 in the 39 Productive Rearrangements Detected by Both Primer Sets in the Validation Cohort
| Measure | FR1 primer set | Leader primer set |
|---|---|---|
| Mean total FR1 length (range), bp | 6.5 (3–10) | 75 (75–75) |
| Mean percent of total FR1 region (range) | 8.7 (4–13.3) | 100 (100–100) |
| Total length of sequenced V region, mean (range), bp | 227 (219–236) | 295 (288–302) |
Prospective Clinical Implementation of NGS-Based Assessment of IGHV and SHM in CLL: FR1 and Leader Primer Set Comparison
Having established an overall excellent performance of the NGS-based assay for IGHV clonotypic sequence detection and SHM status assessment, we implemented its use in routine clinical practice for prognostic evaluation of patients with CLL. The MiSeq platform was selected on the basis of our laboratory requirements for throughput, its overall superior performance with homopolymer regions, and the ability to use the leader primer set. To facilitate streamlined sequencing of the CLL samples with other clinical samples (for clonal characterization and minimal residual disease assessment), some of the samples were sequenced with FR1 primers upfront and resequenced with leader primers if borderline SHM results were encountered (rates of 2% to 3%) or to clarify difficult cases (including those with mutation rates between 1.5% and 2%).
A total of 423 unique clinical CLL samples were evaluated prospectively and sequenced upfront using leader [215 cases (51%)] or FR1 primer sets [208 cases (49%)]. The testing scheme is explained in Figure 3, and the sequencing characteristics and results are summarized in Table 2, including samples that were sequenced with both primers (described below), such that the results ultimately reflect the most informative primer used: 218 cases by leader and 205 by FR1. The mean total number of reads generated by FR1 was higher than by leader analysis (mean, 759,870 versus 562,757; P < 0.0001) (Table 2), likely due to higher efficiency of amplification of the smaller FR1-generated amplicons, whereas other parameters were similar. Among all 423 cases, 36 (9%) were sequenced with both primers to clarify the initial results of challenging cases, including sequencing failure of the initial primer set [1 (0.2%)], borderline SHM rate [10 (2.4%)], failure of the initial primer set to detect a productive rearrangement [3 (0.7%)], and absence of a clonotypic rearrangement [polyclonality in 8 (1.9%)], or to clarify the presence or suspicion of any additional minor clonal sequences [14 (3.3%)] (Figure 3).
Figure 3.
Flowchart of the analysis strategy of the 423 clinical chronic lymphocytic leukemia/small lymphocytic lymphoma samples evaluated by the framework region (FR) 1 and leader primer sets on the MiSeq platform. Thirty-six challenging samples were evaluated by both primer sets and based on the success of the analysis as outlined above; cases were reassigned to the alternate primer set when appropriate. The bottom row indicates whether successful identification of at least one productive clonal sequence per case was observed. Asterisk includes one case (case 98 in Figure 4) in which the reflexed leader analysis was not informative because it did not detect a major clone (polyclonality) and was discordant from FR1 results.
Table 2.
General Comparison of the Outputs by FR1 and Leader Primer Sets of the Clinical Implementation Cohorts (MiSeq Platform) Based on the Analysis Strategy Outlined in Figure 3
| Measure | Overall (n = 423) | Leader (n = 218) | FR1 (n = 205) | P |
|---|---|---|---|---|
| Sex, female/male, n (% female) | 154/269 (36) | 75/143 (34) | 79/126 (39) | NS |
| Age, mean (range), years | 63.9 (27–93) | 62.8 (27–90) | 64.9 (36–93) | NS |
| Sample type, n (%) | ||||
| Blood/bone marrow | 348 (82)/45 (11) | 189 (87)/10 (4.5) | 159 (78)/35 (17) | NS |
| FFPE | 27 (6) | 18 (8) | 9 (4) | |
| Other | 3 (1) | 1 (0.5) | 2 (1) | |
| Total reads, mean (range), n | 658,285 (30,958–2,451,851) | 562,757 (32,973–1,294,699) | 759,870 (30,958–2,451,851) | <0.0001 (t-test) |
| Cases with clonal sequences, n (%) | ||||
| 0 | 5/423 (1.2) | 1∗/218 (0.5) | 4/205 (2) | NS |
| 1 | 337/423 (79.7) | 177/218 (81.2) | 160/205 (78) | |
| 2 | 79/423 (18.7) | 39/218 (17.9) | 40/205 (19.5) | |
| 3 | 2/423 (0.5) | 1/218 (0.5) | 1/205 (0.5) | |
| Total clonotypes detected, n | 501 | 258 | 243 | NS |
| Total productive clonotypes, n detected | 435 | 228 | 207 | NS |
| Major productive clonotype, n (%) | ||||
| Mutated (M) | 192/418 (46) | 105/217 (48) | 87/201 (43) | NS |
| Unmutated (UM) | 226/418 (54) | 112/217 (52) | 114/201 (57) | |
| Cases with additional rearrangement(s), n (%)† | ||||
| Productive (P) | 17/81 (21) | 11/40 (27.5) | 6/41 (15) | NS |
| Unproductive (U) | 64/81 (79) | 29/40 (72.5) | 35/41 (85) | |
| M | 34/81 (42) | 19/40 (47.5) | 15/41 (37) | |
| UM | 47/81 (58) | 21/40 (52.5) | 26/41 (63) | |
| Cases with single productive rearrangement, n (%) | ||||
| M | 162/337 (48) | 86/177 (49) | 76/160 (48) | – |
| UM | 175/337 (52) | 91/177 (51) | 84/160 (52) | – |
| Double rearrangement (>1 clonotype), n (%) | 81/423 (19) | 40/218 (18) | 41/205 (20) | |
| P + U | 64/423 (15) | 29/218 (13) | 35/205 (17) | – |
| Mutation status concordant | 59/64 (92) | 28/29 (97) | 31/35 (89) | – |
| P,M + U,M | 20/59 | 12/28 | 8/31 | – |
| P,UM + U,UM | 39/59 | 16/28 | 23/31 | – |
| Mutation status discordant | 5/64 (8) | 1/29 (3) | 4/35 (11) | – |
| P,UM + U,M | 5/5 | 1/1 | 4/4 | – |
| P,M + U,UM | 0 | 0 | 0 | – |
| Double productive (PP) | 15/423 (3.5) | 10/218 (4.6) | 5/205 (2.4) | NS |
| Mutation status concordant | 13/15 | 9/10 | 4/5 | – |
| Both M | 8/13 | 6/9 | 2/4 | – |
| Both UM | 5/13 | 3/9 | 2/4 | – |
| Mutation status discordant | 2/15 | 1/10 | 1/5 | – |
| Triple rearrangement | 2/423 (0.5) | 1/218 (0.5) | 1/205 (0.5) | – |
| Follow-up sample analyzed | 16/81 | 6/40 | 10/41 | – |
| Clonotype, % of total reads (range)‡ | – | |||
| All clonotypes detected | 43 (2.5–92) | 34 (3.5–74.7)∗∗∗ | 52 (2.5–92)∗∗∗ | <0.0001 |
| Major productive | 46 (2.5–92)§ | 37 (2.7–75)∗∗∗§ | 57 (2.5–92)∗∗∗§ | <0.0001 |
| Additional clonotype(s) | 23 (3.2–63)§ | 19 (3.6–63)∗∗§ | 28 (3.2–59)∗∗§ | 0.0026 |
The χ2 test was used to evaluate the statistical significance of categorical variables. The t-test was used for continuous variables.
∗∗P ≤ 0.01; ∗∗∗P ≤ 0.0001.
FFPE, formalin-fixed, paraffin-embedded; FR1, framework region 1; NS, not statistically significant.
∗Single unproductive only.
†Including case(s) with three clonotypes.
‡The P values provided in the table reflect a comparison of leader versus FR1.
§Statistically significant difference seen between percentage of reads of the major and additional clonotypes (P < 0.0001) within both primer sets and overall.
In all, 501 distinct clonotypic sequences were detected (435 productive and 66 unproductive) (Table 2) across the cohort. Most commonly [337 of 423 cases (80%)], only a single productive clonal IGHV rearrangement was identified [leader, 177 of 218 (81.2%); FR1, 160 of 205 (78%)]. Multiple IGH clonal rearrangements (two or three) were identified in 81 of 423 cases (19%), detected at similar proportions by leader [40 of 218 (18%)] and FR1 [41 of 205 (20%)] sequencing. Cases with three clonal sequences [2 of 423 (0.5%)] were rare, all harboring two productive sequences and one unproductive sequence, suggestive of a combination of biclonal and biallelic rearrangements. Most cases with double rearrangements (two clonotypes) had a combination of a productive with an unproductive (P+U) rearrangement [64 of 423 (15%)], whereas fewer [15 of 423 (3.5%)] had two productive clonotypes (PP). In 72 of 79 cases (91%), both clonotypes shared the same SHM status (both mutated or both unmutated). Discordant SHM status of the two clonotypes was seen in 5 of 64 P+U cases (7.8%) and 2 of 15 PP cases (13%). Among the 81 cases with multiple rearrangements, the nondominant clonal sequences were identified at a significantly (P < 0.0001, t-test) lower proportion of total reads (24%; range, 3.2% to 63%) compared with the major productive rearrangement (46%; range, 2.5% to 92%); therefore, the biology of the IGHV rearrangement (biclonal versus biallelic versus amplification bias) could not be definitively determined. Multiple rearrangements were also observed in follow-up patient samples (available from 16 of 81 patients) (Supplemental Table S5), which showed identical double IGHV clonotypic sequences, confirming the initial findings.
Across all the productive rearrangements sequenced by leader (228) or FR1 (207), the distribution of mutation rates showed no statistically significant differences between these two primer sets, including clonotypes with borderline SHM rates (2.0% to 3.0%), which were rare [6 of 228 (2.6%) for leader and 7 of 207 (3.4%) for FR1] (Figure 2D).
The large clinical cohort confirmed the findings of the validation set. The FR1 region was variably captured, depending on the primer used, with a mean of 9% versus 100% for the FR1 and leader primers, respectively (Supplemental Table S6). For cases sequenced with leader, the distribution of mutation rates among the various regions confirms that the FR1 region generally harbors the lowest rate of mutation (mean, 3%; range, 0% to 12%) compared with CDR1 and CDR2, which account for a larger proportion of the total mismatched bases per region (Supplemental Figure S7), explaining why partially sequencing the FR1 region does not critically affect the SHM rates.
The overall success rate for detecting at least one productive clonotypic sequence by leader or FR1 primer sets was similarly high (difference not statistically significant): 97.7% by leader (210 of 215) and 94.7% by FR1 (197 of 208), including samples evaluated by single primers and those with concordant results when sequenced with both primer sets. Combination of both primer sets showed a 98.8% (n = 418/423) success rate. Among the 36 problematic samples that required evaluation by both primer sets (Figure 4), 24 (67%) showed concordant results (19 cases with matching clonotypes and 5 cases with no productive clonotypes detected), 9 (25%) showed a major discrepancy (a dominant productive rearrangement reportable by only one primer set), and 6 (16%) showed a minor discrepancy in the form of additional rearrangements reportable by one but not the other primer set or disagreement of the formal binary SHM status assignment due to borderline mutation rates. Descriptions of all discrepancies are given in Supplemental Table S7.
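The comparison of success rates above can be illustrated with a minimal sketch of a Pearson χ2 test on a 2 × 2 table, using the counts quoted in the text (leader, 210 of 215; FR1, 197 of 208). The function name `chi_square_2x2` is hypothetical, and the choice of an uncorrected Pearson test is an assumption; the exact test settings used for this particular comparison are not stated.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Hypothetical helper for illustration; returns (chi2, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for observed, row, col in (
        (a, row1, col1),
        (b, row1, col2),
        (c, row2, col1),
        (d, row2, col2),
    ):
        expected = row * col / n
        chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Successes and failures per primer set, as quoted in the text.
leader_success, leader_total = 210, 215
fr1_success, fr1_total = 197, 208

chi2, p = chi_square_2x2(
    leader_success,
    leader_total - leader_success,
    fr1_success,
    fr1_total - fr1_success,
)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the P value is above 0.05, in line with the statement that the difference between the two primer sets is not statistically significant.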
Figure 4.
Thirty-six cases from the clinical implementation cohort evaluated by both framework region (FR) 1 and leader primer sets. Summary of sequencing results by FR1 MiSeq and leader MiSeq. The bottom panel shows color-coded qualitative results as explained in the key. The top panels show bar graphs corresponding to the percentage of total reads for each clonal sequence detected by the two primer sets. The numbers at the top of each column correspond to the unique sample identifier. F, failure; ND, not detected; NR, not reported.
In all, among the samples evaluated by both leader and FR1 primer sets (Figure 4), 30 matching clonotypes were sequenced from 24 samples. Comparison of the mutation rates generated by both primer sets for the matching productive sequences (23 of 30 clonotypes) showed excellent agreement, with high correlation by linear regression (r = 0.9833; 95% CI, 0.9603 to 0.9930; R2 = 0.9669; P < 0.0001) (Supplemental Figure S8). As in the validation cohort, the Bland-Altman analysis revealed a systematic bias in the FR1 output, with a slightly higher mutation rate compared with leader-based sequencing (mean, +0.31; 95% limits of agreement, −0.42 to 1.04), leading to slight differences in the distribution among mutation rate bins (Supplemental Figure S8). For a minor subset of cases [3 of 36 (8%)] with a borderline SHM status (2% to 3%; cases 100, 149, and 191) that were initially tested by FR1 (Figure 4), resequencing with leader changed the formal binary SHM status assignment; this was very rare when considering the entire cohort of cases successfully evaluated [3 of 418 (0.7%)].
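The Bland-Altman limits of agreement quoted above can be reproduced from the mean ± SD bias given in the legend of the corresponding supplemental figure (0.3104 ± 0.3702). The following is a minimal sketch, assuming the conventional definition of the limits as bias ± 1.96 SD; the function names and the paired example values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def limits_of_agreement(bias, sd, z=1.96):
    """95% limits of agreement, assuming the conventional bias +/- 1.96 * SD."""
    return bias - z * sd, bias + z * sd


def bland_altman(rates_a, rates_b):
    """Return (bias, sd, lower, upper) for the paired differences a - b."""
    diffs = [a - b for a, b in zip(rates_a, rates_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower, upper = limits_of_agreement(bias, sd)
    return bias, sd, lower, upper


# Summary values from the supplemental figure legend (FR1 minus leader).
lower, upper = limits_of_agreement(0.3104, 0.3702)
print(round(lower, 2), round(upper, 2))

# Hypothetical paired mutation rates (%), for illustration only.
fr1_rates = [0.0, 1.8, 4.2, 7.9, 10.3]
leader_rates = [0.0, 1.4, 3.9, 7.4, 10.1]
print(bland_altman(fr1_rates, leader_rates))
```

A positive bias in this convention means that the first method (here FR1) reports the higher mutation rate on average.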
Among the 435 identified productive clonal sequences, the usage of the IGHV and IGHJ gene families was evaluated (summarized in Supplemental Figure S9), with IGHV3 and IGHJ4 being the most common among this cohort, as has been previously published in studies of CLL.18,20 In addition, the NGS-generated sequences were further analyzed to identify clonotypes that can be assigned to the 19 major stereotyped subsets based on the CDR3 sequence evaluation. Analysis of our cohort using the ARResT/AssignSubsets tool30 (summarized in Supplemental Figure S9) shows that 41 of 435 sequences (9.4%) can be assigned to one of these subsets, which is similar to the approximately 12% to 13% reported in the literature,20,22,24 and this can be used to provide additional prognostic information.
Discussion
Despite the attractiveness of the NGS-based approach, standardization of both the technical and interpretive aspects of the technology remains lacking, limiting a concerted action to implement this as the method of choice across laboratories. Several studies, however, have reported on the feasibility of IGHV gene clonality and repertoire assessment by NGS,32, 33, 34 including a recent study from our group that described in detail the validity, efficiency, and sensitivity of our method, which relies on a commercially available assay based on the BIOMED-2 primer design.25 In this follow-up study, we describe our validation and clinical experience in the routine assessment of CLL samples for detection of a clonal rearrangement, SHM, and stereotypy analysis. To this end, a detailed comparison of the two most common platforms used in clinical laboratories (Ion PGM and Illumina MiSeq) was performed as well as critical assessment of testing by leader versus FR1 primer sets.
Taken together, the platform comparison with FR1 primers shows an overall better sequencing performance of the Illumina MiSeq, generating higher data throughput and superior data quality with a more streamlined workflow compared with Ion PGM. In keeping with well-known technical limitations of the PGM chemistry,35 a slightly higher rate of sequencing errors was observed in homopolymer regions (1 of 35 cases among the validation samples), which led to an error of a dominant productive clonotype being misidentified as unproductive. Overall, across the validation samples, the error rate was low (3%) and generally did not affect the dominant clonotype calls. Importantly, both NGS platforms delivered identical SHM rates (100% concordance) for all the concordant rearrangements detected. Ion PGM chemistry, however, has a limited length of sequencing that affects the availability of leader-based analysis. Another limitation of Ion PGM compared with Illumina MiSeq is fewer available bar-coded adaptors (12 versus 24 indexes, respectively), leading to fewer samples that can be assayed per run. However, the total assay time is significantly shorter for Ion PGM compared with Illumina MiSeq (21.5 versus 48 hours total, respectively), which may be more suitable in certain laboratory settings. The lower total sequencing capacity of Ion PGM compared with Illumina MiSeq also limits other applications of this assay, such as its use in minimal residual disease assessment. Overall, for the purposes of SHM status, both platforms are suitable, and the decision should be based on the needs of the laboratory for throughput and technology availability.
This comparison of FR1 and leader primer assays for their detection of the productive IGH rearrangement and determination of SHM status was an important consideration for the successful transition to NGS repertoire assessment in our laboratory and the seamless incorporation of SHM testing in day-to-day operations. Although current ERIC recommendations urge the use of only leader primers for evaluation of SHM, under the premise that the entire FR1 region needs to be sequenced for accurate risk stratification, we find that this has important workflow implications. Paired-end sequencing with read lengths of 2 × 300 bp (600 cycle) not only increases sequencing time and overall cost of the assay but also has an impact on the quality metrics of Illumina sequencing and the number of samples that can be sequenced in the same run. In our clinical laboratory, most of the clonal characterization and monitoring of plasma cell and lymphoid neoplasms is performed with IGH FR1 and less frequently with IGH FR2, IGH FR3, and T-cell receptor γ primer sets, all of which are performed with 500-cycle reagent kits (version 2). Although it is possible to multiplex any combination of LymphoTrack IGHV and T-cell receptor γ assays together in a single sequencing library, adding samples analyzed by leader always requires sequencing with the longer 600-cycle kits (version 3). This leads to oversequencing of shorter fragments and ultimately affects the overall quality of the run, calling for a reduction in the number of samples being tested in each run and limiting the throughput in the laboratory. Conversely, the alternate approach of running CLL samples alone in dedicated runs causes similar delays in turnaround time as a result of the longer sequencing and temporal spacing of the runs to batch sufficient samples for independent runs.
FR1 and leader primers demonstrated equally high success in the detection of the productive rearrangement [leader, 97.7%; FR1, 94.7% (not significant); combination of both, 98.8%] and the determination of the SHM status (Figure 3). The correlation of the calculated SHM rate with both primer sets was very strong (correlation coefficient R2 = 0.984, P < 0.0001), which is consistent with prior reports using laboratory-developed and EuroClonality/BIOMED-2 primers with gold standard multiplex PCR-based sequencing protocols for SHM.36 Close examination identified a consistent overestimation bias in SHM with FR1 sequencing (+0.33%); however, this difference is small because of the low mutation rate of this specific region compared with the other CDRs and FRs. This finding is in agreement with reported observations in a French cohort of 74 patients with CLL36 as well as other studies, showing that physiologic SHM in nonneoplastic B cells is preferentially targeted to CDRs compared with the FRs.37, 38, 39 Because sequencing with FR1 primers only captures a small portion of FR1 (range, 4% to 13% in the current cohort), the calculated mutation rate is higher because of the smaller denominator and exclusion of the region with the lowest mutation rate. Despite the difference and even with complete removal of the FR1 in the in silico experiments, the ultimate stratification of a clinical sample as mutated, unmutated, or borderline is not affected in the overwhelming majority of cases. Among the validation samples, a discrepant SHM status was seen in a single clonotype with a borderline mutation rate near 2% (validation sample 48) (Figure 1A), and intriguingly the mutation rate was lower by FR1-based sequencing (1.76%) compared with leader (2.36%), suggesting that the bias may not always be predictable. 
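The denominator effect described above can be made concrete with a small worked sketch. All region lengths and mismatch counts below are hypothetical round numbers chosen for illustration, not study data, and the FR1 region is dropped entirely (as in the in silico experiment) rather than partially captured as with FR1 primers.

```python
def mutation_rate(regions):
    """Percent mismatched bases over the sequenced length.

    `regions` maps a region name to (length_nt, mismatches).
    """
    total_length = sum(length for length, _ in regions.values())
    total_mismatches = sum(mm for _, mm in regions.values())
    return 100.0 * total_mismatches / total_length


def drop_region(regions, name):
    """Return a copy of `regions` without the named region."""
    return {key: value for key, value in regions.items() if key != name}


# Hypothetical clonotype in which FR1 is the least mutated region.
typical = {
    "FR1": (75, 1),
    "CDR1": (24, 2),
    "FR2": (51, 1),
    "CDR2": (24, 2),
    "FR3": (114, 3),
}

# Hypothetical clonotype in which most mismatches fall within FR1.
atypical = {
    "FR1": (75, 4),
    "CDR1": (24, 0),
    "FR2": (51, 1),
    "CDR2": (24, 1),
    "FR3": (114, 1),
}

for label, clonotype in (("typical", typical), ("atypical", atypical)):
    full = mutation_rate(clonotype)
    without_fr1 = mutation_rate(drop_region(clonotype, "FR1"))
    print(f"{label}: full = {full:.2f}%, FR1 removed = {without_fr1:.2f}%")
```

In the first hypothetical clonotype, removing a sparsely mutated FR1 raises the calculated rate, matching the overall direction of the bias. In the second, where FR1 carries most of the mismatches, the rate falls, which is one way the direction of the difference could reverse in an individual case.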
In the clinical cohort, 405 of 418 successfully analyzed cases (96.9%) were characterized by mutation rates in the unambiguous SHM status category (≤1.9% and >3%, representing unequivocally unmutated and mutated status, respectively). Although additional sequencing with leader among cases with a borderline SHM rate [10 of 418 (2.4%)] provided a more accurate calculated mutation rate across the entire variable region, the overall binary SHM status assignment was altered in only 3 cases (0.7%). On the basis of a comparison of leader and FR1 sequencing, we conclude that, in a careful consideration of laboratory workflows, sequencing with FR1 primers may be considered if leader testing is not available or possible. Leader testing should be performed in all FR1-sequenced cases that reveal a borderline SHM status (near the 2% mutation rate cutoff), when a clonotypic rearrangement is not detected, or in difficult cases for other reasons. Importantly, a combined approach of sequencing by both primer sets in problematic cases is highly valuable and reduced the failure rate from 3.1% (n = 13/423) to 1.2% (n = 5/423), supporting the validity of this approach.
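The reporting logic described above can be sketched as a simple decision rule. This is a minimal illustration rather than the laboratory's actual procedure: the function name `shm_call`, the handling of values exactly at 2.0% and 3.0%, and the returned field names are assumptions made for the example.

```python
from typing import Optional


def shm_call(rate_percent: Optional[float], primer: str) -> dict:
    """Hypothetical SHM reporting rule.

    Assumptions (not taken from a published algorithm):
    - binary cutoff: a rate <= 2.0% (>= 98% identity) is unmutated, otherwise mutated
    - borderline range: 2.0% to 3.0%, inclusive
    - FR1 results are reflexed to leader when borderline or when no
      clonotype was detected (rate_percent is None)
    """
    if rate_percent is None:
        return {"status": None, "borderline": False, "reflex_to_leader": primer == "FR1"}
    status = "unmutated" if rate_percent <= 2.0 else "mutated"
    borderline = 2.0 <= rate_percent <= 3.0
    return {
        "status": status,
        "borderline": borderline,
        "reflex_to_leader": borderline and primer == "FR1",
    }


# Illustrative inputs only; these are not study cases.
for rate in (1.2, 2.4, 6.0, None):
    print(rate, shm_call(rate, "FR1"))
```

A rule of this kind leaves the unambiguous categories untouched and routes only borderline or failed FR1 results to leader sequencing. The additional trigger mentioned in the text, difficult cases for other reasons, requires manual review and is not modeled here.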
Furthermore, the high success rate of detecting a productive clonotype, averaging approximately 97% with individual primers, reiterates the superiority of NGS over traditional multiplex PCR-based methods, which reach only approximately 85% based on published literature.36 Successful detection was further improved with the combination of FR1 and leader sequencing in select cases, reaching 98.8% for NGS versus 93% by conventional methods.
The current study also detected a slightly higher proportion of cases with more than one clonotypic sequence per sample [81 of 423 (19.1%)] compared with previous reports that enriched for neoplastic B cells (12% to 14.8%).40,41 This finding can be explained by the higher sensitivity of the NGS-based method even without enrichment for the abnormal B-cell populations. Most double rearrangements were predicted to be P+U [15% overall and 79% of multiply rearranged samples (n = 64/81)], which may represent biallelic rearrangements. Less commonly, PP rearrangements were found in 15 of 423 cases (3.5%), a rate in agreement with prior studies that relied on traditional analysis using CLL mRNA/cDNA (3.8%,41 3.1%42) and gDNA (2.1%41). We note, however, that the prevalence of PP rearrangements would likely be higher in the current study if patients with biphenotypic CLL by flow cytometry (approximately 6%), which has been reported in CLL,43 were included. The study design does not allow us to conclusively determine whether double IGHV rearrangements are biallelic or biclonal, both of which have been demonstrated in CLL by single-cell analysis.44 A limitation of this study is the inability to assess whether these secondary rearrangements exist at the mRNA level. Of note, although one would be tempted to conclude a biallelic nature of these clonotypes based on the clonotype read frequencies, amplification biases are common in PCR-based methods, and clonotype proportions should be interpreted with caution. Overall, FR1 sequencing showed a higher efficiency of amplification (higher total reads and higher proportion of major and secondary clonotypes) compared with leader; however, the similar success rate of clonotype detection suggests that the approach used for clonotype analysis and reporting is valid for both primer sets.
In most of our cases with multiple rearrangements, the mutation status was concordant [59 of 64 P+U cases (92%) and 13 of 15 PP cases (87%)], which is similar to the published assessment at the cDNA and gDNA levels by traditional methods [326 of 350 P+U cases (93%) and 56 of 85 PP cases (65%)],41 including studies that only examined productive sequences42,45 and those that did not distinguish the functionality.40 Among the P+U clonotypes, discordant mutation status was uncommon [5 of 64 (8%)], revealing only P,UM + U,M, which was similar to published findings [24 of 350 (7%)].41 Discordant mutation status among PP clonotypes in the current cohort was less frequent than in published studies [2 of 15 (13.3%) versus 29 of 85 (34%)]41; however, this may be due to excluding cases with biphenotypic disease by flow cytometry. The prognostic significance of double IGHV rearrangements has been an active area of investigation,42,45 and ERIC guidelines suggest that the prognosis is similar to standard single IGHV rearranged cases or driven by the productive rearrangement if discordant.17,19 One limitation of our study is that we did not include clinical parameters and outcomes in our analysis because the primary focus of the study was to address the technical aspects of NGS-based assessment of SHM status and compare methods. In addition, the relatively short follow-up interval and study design preclude evaluation of clinical implications.
In summary, a high-throughput NGS-based assessment of IGHV rearrangements can be used for the routine clinical evaluation of SHM status. Comparable results can be obtained using the FR1 primer set by PGM- and Illumina-based chemistries, including identical mutation rates. Leader-based analysis performed with Illumina sequencing shows comparable sensitivity to FR1, and analysis with both primer sets increased sensitivity. Although mutation rates are consistently slightly elevated when sequenced with FR1 primers, this has no impact on the overall classification in most cases except for borderline cases, which can be readily identified and should always be reflexed to leader testing. The clinical implementation results from the largest single-center study to date of patients with CLL evaluated prospectively by NGS-based IGHV mutation status of gDNA samples showed that double or multiple IGHV rearrangements can be detected at a higher frequency than by using traditional methods. Further analysis of samples with multiple rearrangements is needed and requires clear guidelines regarding their detection and evaluation to better assess their clinical impact.
Acknowledgments
We thank Hannah Kim and Maureen Flaherty (Hematopathology and Molecular Diagnostics Services, Department of Pathology, Memorial Sloan Kettering Cancer Center) and Marie-Josee and Henry R. Kravis (Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center).
Footnotes
Supported in part through NIH National Cancer Institute Cancer Center Support grant P30 CA008748, NIH National Cancer Institute MSK Lymphoma SPORE grant P50 CA192937 (A.D.), and the Farmer Family Foundation (A.D.).
Disclosures: K.H., A.M.Z., and Y.H. are employees of Invivoscribe, Inc., San Diego, CA, which developed and sells the commercial assay used in this article. C.H. has received personal fees (not related to this study) from Blueprint Medicines, Hematopathology Advisory Board. C.M.V. has received personal fees (not related to this study) from DocDoc Pte. Ltd., and Paige.AI, Inc. M.E.A. has received personal fees (not related to this study) from Biocartis US, Inc., Invivoscribe, physician educational resources, Peerview Institute for Medical Education, Clinical Care Options, RMEI Medical Education, Janssen Global Services, Bristol-Myers Squibb, AstraZeneca, Roche, and Merck. M.D.E. has received personal fees (not related to this study) from serving on advisory boards of Loxo Oncology at Lilly and Acceleron Pharma (now acquired by Merck). A.D. has received personal fees from Physicians' Education Resource, Seattle Genetics, Takeda, Roche, EUSA Pharma, Peerview Institute, Corvus Pharmaceuticals, and AbbVie and research support from Roche and Takeda. A.M. has received personal fees (not related to this study) from TG Therapeutics, Pharmacyclics LLC, AbbVie, Adaptive Biotechnologies, Johnson and Johnson, Acerta, DTRM BioPharma, Nurix, AstraZeneca, BeiGene, Genentech, Janssen, Loxo, Curio, Dava, Octopharma, Genmab, BMS, Medscape, PER, and PerView and research support from TG Therapeutics, Pharmacyclics, AbbVie, Johnson & Johnson, AstraZeneca, DTRM BioPharma, BeiGene, Genentech, Genmab, Janssen, Loxo, Nurix, Octopharma, and Pfizer. L.E.R. has served as a consultant for AbbVie, Ascentage, AstraZeneca, Beigene, Janssen, Loxo Oncology, Pharmacyclics, Pfizer, and TG Therapeutics, served as a continuing medical education speaker for DAVA, Curio, and Medscape, holds minority ownership interest in Abbott Laboratories, received travel support from Loxo Oncology, and received research funding (paid to the institution) from Pfizer, Loxo Oncology, Aptose Biosciences, and Qilu Puget Sound Biotherapeutics. The remaining authors declare no relevant competing financial interests.
Current address of M.S., Children's Cancer Institute, Sydney, NSW, Australia; of K.H., Eclipse Bio, San Diego, CA; of C.H., Loxo Oncology at Lilly, Stamford, CT; of C.M., Providence Health and Services Regional Laboratory, Portland, OR.
Supplemental material for this article can be found at http://doi.org/10.1016/j.jmoldx.2023.02.005.
Author Contributions
K.P.-D. and M.E.A. designed the study, collected and analyzed the data, and wrote and revised the manuscript; M.S., W.Y., K.H., A.M.Z., Y.H., and M.K.-C. analyzed the data, and revised and approved the final manuscript; W.Y., L.M., Y.G.M., M.W., P.S., I.R., and T.B. performed the assays; W.Y., Y.H., L.M., H.K., C.H., C.M., J.Y., K.N., C.M.V., J.K.B., Y.L., M.Z., B.D., M.D.E., A.M., L.E.R., P.S., I.R., T.B., J.B., and M.R. provided study data and approved the final manuscript; and A.D. contributed to scientific discussions, and revised and approved the final manuscript.
Supplemental Data
Quality metrics of the validation set: line graph showing percentage of bases with a quality score of ≥30 (Q30 scores) in each sample evaluated on the MiSeq platform with framework region (FR) 1 and leader and percentage of bases with a quality score of ≥20 (Q20 scores) for PGM FR1 assessment. Because of the inherent differences in chemistries between the two sequencing platforms, only Q20 evaluation is shown for PGM. Overall, the following per-run quality control metrics were considered optimal for MiSeq: cluster density of 600 ± 107 K/mm2 and cluster passing filter percentage >80%. Mean Q30 scores of >70% and >75% were considered acceptable for MiSeq leader and FR1 sequencing, respectively. The quality metrics for PGM runs were based on PGM run report for Ion Sphere Particle summary: loading, >50%; enrichment, >50%; and clonal, >50%.
Validation set: discordant case (case 40) by PGM1 versus MiSeq framework region (FR) 1. Sequence alignment of the top two sequences by PGM output (cases 1 and 2, representing 65.4% and 12.8% of the total reads, respectively) to the output generated by the FR1-MiSeq assay shows that these are identical, with the exception of mismatches detected in the homopolymer regions. Furthermore, the leader primer set, which was only available on the MiSeq platform, supports detection of this clonal sequence. The output on the MiSeq platform shows that the available sequenced segment by FR1 is identical to that generated by leader, which is predicted to be productive.
Validation set: discordant case (case 48). Sequence alignment of the top productive sequence by leader (516 bp), framework region (FR) 1 MiSeq and PGM (266 bp), and reference method (Ref.) shows that FR1-generated sequences are identical to leader, whereas the sequence by the reference method is similar with a few small insertions and several point mutations, accounting for the higher somatic hypermutation (SHM) rate. The difference in SHM rate as generated by FR1 (MiSeq or PGM, 1.76%) compared with leader (2.36%) is likely due to capture of a larger portion of the rearranged IGHV gene, which includes the entire FR1 region and leads to a higher calculated mutation rate. A second related productive clonal sequence with a 41-bp deletion that may represent clonal drift (Figure 1A, alignment not shown) was also identified in this sample by all four methods and is best detected by leader, whereas FR1 primers amplify downstream of the deleted region, making it harder to distinguish it as a separate clonotype from the rearranged sequence shown here.
Validation set: comparison of mutation rates by the framework region (FR) 1 primer set sequenced on the PGM and MiSeq platforms for all 48 matching clonal sequences detected. A: Scatterplot showing the correlation and linear regression analysis of mutation rates (r = 1; 95% CI, 1 to 1; R2 = 1; P < 0.0001). The best fit linear regression line (black line) overlaps with the 95% CIs (red dashed lines). B: Bland-Altman plot showing a difference versus mean analysis of the mutation rates identified by the two platforms (PGM versus MiSeq), showing no differences [mean ± SD, 0.0 ± 0.0; 95% limits of agreement, 0 to 0 (red dotted lines)].
Validation cohort: distribution of mean mutation frequencies along the IGHV gene regions in all productive matching rearrangements with <100% identity compared with germline (19 clonotypes). The mean mutation rates per framework region (FR) and complementarity-determining region (CDR) were analyzed in 19 productive clonal sequences with any degree of mutation (>0%, nongermline configuration) generated by the leader primer set. The mutation rate per region was calculated as a percentage of the mismatched bases per length of each specific region. The mean mutation rates per region are shown in the bar graph, and the ranges for each are as follows: FR1, 0% to 8%; CDR1, 0% to 21%; FR2, 0% to 8%; CDR2, 0% to 42%; FR3, 2% to 13%; and CDR3, 0% to 14%.
Comparison of mutation rates when including and in silico removing the framework region (FR) 1 from 39 clonal productive sequences generated by the leader primer set. A: Scatterplot showing the correlation and linear regression analysis of mutation rates (r = 0.9867; 95% CI, 0.975 to 0.993; R2 = 0.974; P < 0.0001). The best fit linear regression line (black line) and 95% CIs (red dashed lines) are shown. B: Bland-Altman plot showing a difference versus mean analysis of the mutation rates identified by the two methods (leader versus FR1 removed from leader) with a mean ± SD bias of 0.480 ± 0.727 (black line) and 95% limits of agreement of −0.945 to 1.90 (red dotted lines). The green line represents a difference between the assays of zero. C: Histogram showing frequency distribution of mutation rates grouped by bins: 0%, 0.1% to 1.4%, 1.5% to 1.9%, 2.0% to 3.0%, and >3.1%. The two sets of bars correspond to results obtained by leader (black bars) and removal of FR1 from leader generated sequences (gray bars), which are identical.
Clinical cohort sequenced with leader: distribution of mean mutation frequencies along the IGHV gene regions in all productive rearrangements with <100% identity compared with germline. The mean mutation rates per framework region (FR) and complementarity-determining region (CDR) were analyzed from the productive clonal sequences with any degree of mutation (>0%, nongermline configuration) sequenced by leader (135 of 228 productive sequences with a mutation rate >0%). The mutation rate per region was calculated as a percentage of the mismatched bases per length of each specific region. The mean mutation rates per region are shown in the bar graph, and the ranges are as follows: FR1, 0% to 12%; CDR1, 0% to 33%; FR2, 0% to 12%; CDR2, 0% to 33%; FR3, 0% to 17%; and CDR3, 0% to 29%.
Clinical implementation: comparison of mutation rates by framework region (FR) 1 and leader primer sets for productive clonal sequences identified by both primer sets. A: Scatterplot showing the correlation and linear regression analysis of mutation rates (r = 0.9833; 95% CI, 0.9603 to 0.9930; R2 = 0.9669; P < 0.0001). The best fit linear regression line (black line) and 95% CIs (red dashed lines) are shown. B: Bland-Altman plot showing a difference versus mean analysis of the mutation rates identified by the two primer sets (FR1 versus leader) with a mean ± SD bias of 0.3104 ± 0.3702 and 95% limits of agreement from −0.42 to 1.04 (red dotted lines). The green line represents a difference between the assays of zero. C: Histogram showing frequency distribution of mutation rates grouped by bins: 0%, 0.1% to 1.4%, 1.5% to 1.9%, 2.0% to 3.0%, and >3.1%. The two sets of bars correspond to results obtained by leader (black bars) and FR1 (gray bars) primer sets.
A total of 435 clonal productive rearrangements were identified among the clinical chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL) cohort sequenced on the MiSeq with leader and framework region (FR) 1 primers. A: Circos plot showing the IGHV and IGHJ gene families and their combinations detected among this cohort. B: Pie-of-pie chart showing distribution of the major IGH complementarity-determining region (CDR) 3 stereotyped subsets among the productive clonal sequences. The pie chart on the left shows the proportion of sequences that are unassigned (heterogeneous), whereas the pie chart on the right shows the distribution of stereotyped sequences among the 19 major subsets. Major subsets 12, 16, 77, 99, and 202 were not represented in our cohort and are not shown.
References
- 1. Swerdlow S.H., Campo E., Harris N.L., Jaffe E.S., Pileri S.A., Stein H., Thiele J., Arber D.A., Hasserjian R.P., Le Beau M.M., Orazi A., Siebert R. WHO Classification of Tumours of the Haematopoietic and Lymphoid Tissues. Lyon: International Agency for Research on Cancer; 2017.
- 2. Hamblin T.J., Davis Z., Gardiner A., Oscier D.G., Stevenson F.K. Unmutated Ig V(H) genes are associated with a more aggressive form of chronic lymphocytic leukemia. Blood. 1999;94:1848–1854.
- 3. Damle R.N., Wasil T., Fais F., Ghiotto F., Valetto A., Allen S.L., Buchbinder A., Budman D., Dittmar K., Kolitz J., Lichtman S.M., Schulman P., Vinciguerra V.P., Rai K.R., Ferrarini M., Chiorazzi N. Ig V gene mutation status and CD38 expression as novel prognostic indicators in chronic lymphocytic leukemia. Blood. 1999;94:1840–1847.
- 4. Parikh S.A., Strati P., Tsang M., West C.P., Shanafelt T.D. Should IGHV status and FISH testing be performed in all CLL patients at diagnosis? A systematic review and meta-analysis. Blood. 2016;127:1752–1760. doi: 10.1182/blood-2015-10-620864.
- 5. Thompson P.A., Tam C.S., O'Brien S.M., Wierda W.G., Stingo F., Plunkett W., Smith S.C., Kantarjian H.M., Freireich E.J., Keating M.J. Fludarabine, cyclophosphamide, and rituximab treatment achieves long-term disease-free survival in IGHV-mutated chronic lymphocytic leukemia. Blood. 2016;127:303–309. doi: 10.1182/blood-2015-09-667675.
- 6. Fischer K., Bahlo J., Fink A.M., Goede V., Herling C.D., Cramer P., Langerbeins P., von Tresckow J., Engelke A., Maurer C., Kovacs G., Herling M., Tausch E., Kreuzer K.A., Eichhorst B., Bottcher S., Seymour J.F., Ghia P., Marlton P., Kneba M., Wendtner C.M., Dohner H., Stilgenbauer S., Hallek M. Long-term remissions after FCR chemoimmunotherapy in previously untreated patients with CLL: updated results of the CLL8 trial. Blood. 2016;127:208–215. doi: 10.1182/blood-2015-06-651125.
- 7. Byrd J.C., Furman R.R., Coutre S.E., Burger J.A., Blum K.A., Coleman M., Wierda W.G., Jones J.A., Zhao W., Heerema N.A., Johnson A.J., Shaw Y., Bilotti E., Zhou C., James D.F., O'Brien S. Three-year follow-up of treatment-naive and previously treated patients with CLL and SLL receiving single-agent ibrutinib. Blood. 2015;125:2497–2506. doi: 10.1182/blood-2014-10-606038.
- 8.Guo A., Lu P., Galanina N., Nabhan C., Smith S.M., Coleman M., Wang Y.L. Heightened BTK-dependent cell proliferation in unmutated chronic lymphocytic leukemia confers increased sensitivity to ibrutinib. Oncotarget. 2016;7:4598–4610. doi: 10.18632/oncotarget.6727. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.O'Brien S., Furman R.R., Coutre S., Flinn I.W., Burger J.A., Blum K., Sharman J., Wierda W., Jones J., Zhao W., Heerema N.A., Johnson A.J., Luan Y., James D.F., Chu A.D., Byrd J.C. Single-agent ibrutinib in treatment-naïve and relapsed/refractory chronic lymphocytic leukemia: a 5-year experience. Blood. 2018;131:1910–1919. doi: 10.1182/blood-2017-10-810044. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Arnold A., Cossman J., Bakhshi A., Jaffe E.S., Waldmann T.A., Korsmeyer S.J. Immunoglobulin-gene rearrangements as unique clonal markers in human lymphoid neoplasms. N Engl J Med. 1983;309:1593–1599. doi: 10.1056/NEJM198312293092601. [DOI] [PubMed] [Google Scholar]
- 11.Dudley D.D., Chaudhuri J., Bassing C.H., Alt F.W. Mechanism and control of V(D)J recombination versus class switch recombination: similarities and differences. Adv Immunol. 2005;86:43–112. doi: 10.1016/S0065-2776(04)86002-4. [DOI] [PubMed] [Google Scholar]
- 12.Sela-Culang I., Kunik V., Ofran Y. The structural basis of antibody-antigen recognition. Front Immunol. 2013;4:302. doi: 10.3389/fimmu.2013.00302. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Sahota S.S., Leo R., Hamblin T.J., Stevenson F.K. Ig VH gene mutational patterns indicate different tumor cell status in human myeloma and monoclonal gammopathy of undetermined significance. Blood. 1996;87:746–755. [PubMed] [Google Scholar]
- 14.Saini J., Hershberg U. B cell variable genes have evolved their codon usage to focus the targeted patterns of somatic mutation on the complementarity determining regions. Mol Immunol. 2015;65:157–167. doi: 10.1016/j.molimm.2015.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.van Dongen J.J., Langerak A.W., Bruggemann M., Evans P.A., Hummel M., Lavender F.L., Delabesse E., Davi F., Schuuring E., Garcia-Sanz R., van Krieken J.H., Droese J., Gonzalez D., Bastard C., White H.E., Spaargaren M., Gonzalez M., Parreira A., Smith J.L., Morgan G.J., Kneba M., Macintyre E.A. Design and standardization of PCR primers and protocols for detection of clonal immunoglobulin and T-cell receptor gene recombinations in suspect lymphoproliferations: report of the BIOMED-2 Concerted Action BMH4-CT98-3936. Leukemia. 2003;17:2257–2317. doi: 10.1038/sj.leu.2403202. [DOI] [PubMed] [Google Scholar]
- 16.Fais F., Ghiotto F., Hashimoto S., Sellars B., Valetto A., Allen S.L., Schulman P., Vinciguerra V.P., Rai K., Rassenti L.Z., Kipps T.J., Dighiero G., Schroeder H.W., Jr., Ferrarini M., Chiorazzi N. Chronic lymphocytic leukemia B cells express restricted sets of mutated and unmutated antigen receptors. J Clin Invest. 1998;102:1515–1525. doi: 10.1172/JCI3009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Rosenquist R., Ghia P., Hadzidimitriou A., Sutton L.A., Agathangelidis A., Baliakas P., Darzentas N., Giudicelli V., Lefranc M.P., Langerak A.W., Belessi C., Davi F., Stamatopoulos K. Immunoglobulin gene sequence analysis in chronic lymphocytic leukemia: updated ERIC recommendations. Leukemia. 2017;31:1477–1481. doi: 10.1038/leu.2017.125. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Ghia P., Stamatopoulos K., Belessi C., Moreno C., Stilgenbauer S., Stevenson F., Davi F., Rosenquist R. ERIC recommendations on IGHV gene mutational status analysis in chronic lymphocytic leukemia. Leukemia. 2007;21:1–3. doi: 10.1038/sj.leu.2404457. [DOI] [PubMed] [Google Scholar]
- 19.Agathangelidis A., Chatzidimitriou A., Chatzikonstantinou T., Tresoldi C., Davis Z., Giudicelli V., Kossida S., Belessi C., Rosenquist R., Ghia P., Langerak A.W., Davi F., Stamatopoulos K., on behalf of ERIC, the European Research Initiative on CLL. Immunoglobulin gene sequence analysis in chronic lymphocytic leukemia: the 2022 update of the recommendations by ERIC, the European Research Initiative on CLL. Leukemia. 2022;36:1961–1968. doi: 10.1038/s41375-022-01604-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Agathangelidis A., Darzentas N., Hadzidimitriou A., Brochet X., Murray F., Yan X.J., et al. Stereotyped B-cell receptors in one-third of chronic lymphocytic leukemia: a molecular classification with implications for targeted therapies. Blood. 2012;119:4467–4475. doi: 10.1182/blood-2011-11-393694. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Stamatopoulos K., Belessi C., Moreno C., Boudjograh M., Guida G., Smilevska T., Belhoul L., Stella S., Stavroyianni N., Crespo M., Hadzidimitriou A., Sutton L., Bosch F., Laoutaris N., Anagnostopoulos A., Montserrat E., Fassas A., Dighiero G., Caligaris-Cappio F., Merle-Beral H., Ghia P., Davi F. Over 20% of patients with chronic lymphocytic leukemia carry stereotyped receptors: pathogenetic implications and clinical correlations. Blood. 2007;109:259–270. doi: 10.1182/blood-2006-03-012948. [DOI] [PubMed] [Google Scholar]
- 22.Baliakas P., Hadzidimitriou A., Sutton L.A., Minga E., Agathangelidis A., Nichelatti M., et al. Clinical effect of stereotyped B-cell receptor immunoglobulins in chronic lymphocytic leukaemia: a retrospective multicentre study. Lancet Haematol. 2014;1:e74–84. doi: 10.1016/S2352-3026(14)00005-2. [DOI] [PubMed] [Google Scholar]
- 23.Stamatopoulos K., Agathangelidis A., Rosenquist R., Ghia P. Antigen receptor stereotypy in chronic lymphocytic leukemia. Leukemia. 2017;31:282–291. doi: 10.1038/leu.2016.322. [DOI] [PubMed] [Google Scholar]
- 24.Agathangelidis A., Chatzidimitriou A., Gemenetzi K., Giudicelli V., Karypidou M., Plevova K., et al. Higher-order connections between stereotyped subsets: implications for improved patient classification in CLL. Blood. 2021;137:1365–1376. doi: 10.1182/blood.2020007039. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Arcila M.E., Yu W., Syed M., Kim H., Maciag L., Yao J., Ho C., Petrova K., Moung C., Salazar P., Rijo I., Baldi T., Zehir A., Landgren O., Park J., Roshal M., Dogan A., Nafa K. Establishment of immunoglobulin heavy (IGH) chain clonality testing by next-generation sequencing for routine characterization of B-cell and plasma cell neoplasms. J Mol Diagn. 2019;21:330–342. doi: 10.1016/j.jmoldx.2018.10.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Syed M.H.N.K., Baldi T., Zehir A., Cheng D.T., Ladanyi M., Arcila M.E. MSK-LYMPHOCONE: data analysis pipeline and tools for immune repertoire analysis. J Mol Diagn. 2015;17:804a. [Google Scholar]
- 27.Langerak A.W., Groenen P.J., Bruggemann M., Beldjord K., Bellan C., Bonello L., Boone E., Carter G.I., Catherwood M., Davi F., Delfau-Larue M.H., Diss T., Evans P.A., Gameiro P., Garcia Sanz R., Gonzalez D., Grand D., Hakansson A., Hummel M., Liu H., Lombardia L., Macintyre E.A., Milner B.J., Montes-Moreno S., Schuuring E., Spaargaren M., Hodges E., van Dongen J.J. EuroClonality/BIOMED-2 guidelines for interpretation and reporting of Ig/TCR clonality testing in suspected lymphoproliferations. Leukemia. 2012;26:2159–2171. doi: 10.1038/leu.2012.246. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Brochet X., Lefranc M.P., Giudicelli V. IMGT/V-QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis. Nucleic Acids Res. 2008;36:W503–508. doi: 10.1093/nar/gkn316. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Ye J., Ma N., Madden T.L., Ostell J.M. IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res. 2013;41:W34–40. doi: 10.1093/nar/gkt382. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Bystry V., Agathangelidis A., Bikos V., Sutton L.A., Baliakas P., Hadzidimitriou A., Stamatopoulos K., Darzentas N. ARResT/AssignSubsets: a novel application for robust subclassification of chronic lymphocytic leukemia based on B cell receptor IG stereotypy. Bioinformatics. 2015;31:3844–3846. doi: 10.1093/bioinformatics/btv456. [DOI] [PubMed] [Google Scholar]
- 31.Gu Z., Gu L., Eils R., Schlesner M., Brors B. Circlize implements and enhances circular visualization in R. Bioinformatics. 2014;30:2811–2812. doi: 10.1093/bioinformatics/btu393. [DOI] [PubMed] [Google Scholar]
- 32.Scheijen B., Meijers R.W.J., Rijntjes J., van der Klift M.Y., Möbs M., Steinhilber J., Reigl T., van den Brand M., Kotrová M., Ritter J.M., Catherwood M.A., Stamatopoulos K., Brüggemann M., Davi F., Darzentas N., Pott C., Fend F., Hummel M., Langerak A.W., Groenen P. Next-generation sequencing of immunoglobulin gene rearrangements for clonality assessment: a technical feasibility study by EuroClonality-NGS. Leukemia. 2019;33:2227–2240. doi: 10.1038/s41375-019-0508-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.McClure R., Mai M., McClure S. High-throughput sequencing using the Ion Torrent personal genome machine for clinical evaluation of somatic hypermutation status in chronic lymphocytic leukemia. J Mol Diagn. 2015;17:145–154. doi: 10.1016/j.jmoldx.2014.11.006. [DOI] [PubMed] [Google Scholar]
- 34.Davi F., Langerak A.W., de Septenville A.L., Kolijn P.M., Hengeveld P.J., Chatzidimitriou A., Bonfiglio S., Sutton L.A., Rosenquist R., Ghia P., Stamatopoulos K. Immunoglobulin gene analysis in chronic lymphocytic leukemia in the era of next generation sequencing. Leukemia. 2020;34:2545–2551. doi: 10.1038/s41375-020-0923-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Quail M.A., Smith M., Coupland P., Otto T.D., Harris S.R., Connor T.R., Bertoni A., Swerdlow H.P., Gu Y. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics. 2012;13:341. doi: 10.1186/1471-2164-13-341. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Rene C., Prat N., Thuizat A., Broctawik M., Avinens O., Eliaou J.F. Comprehensive characterization of immunoglobulin gene rearrangements in patients with chronic lymphocytic leukaemia. J Cell Mol Med. 2014;18:979–990. doi: 10.1111/jcmm.12215. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Dörner T., Foster S.J., Farner N.L., Lipsky P.E. Somatic hypermutation of human immunoglobulin heavy chain genes: targeting of RGYW motifs on both DNA strands. Eur J Immunol. 1998;28:3384–3396. doi: 10.1002/(SICI)1521-4141(199810)28:10<3384::AID-IMMU3384>3.0.CO;2-T. [DOI] [PubMed] [Google Scholar]
- 38.Neuberger M.S., Ehrenstein M.R., Klix N., Jolly C.J., Yélamos J., Rada C., Milstein C. Monitoring and interpreting the intrinsic features of somatic hypermutation. Immunol Rev. 1998;162:107–116. doi: 10.1111/j.1600-065x.1998.tb01434.x. [DOI] [PubMed] [Google Scholar]
- 39.Di Noia J.M., Neuberger M.S. Molecular mechanisms of antibody somatic hypermutation. Annu Rev Biochem. 2007;76:1–22. doi: 10.1146/annurev.biochem.76.061705.090740. [DOI] [PubMed] [Google Scholar]
- 40.Heyman B., Volkheimer A.D., Weinberg J.B. Double IGHV DNA gene rearrangements in CLL: influence of mixed-mutated and -unmutated rearrangements on outcomes in CLL. Blood Cancer J. 2016;6:e440. doi: 10.1038/bcj.2016.49. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Langerak A.W., Davi F., Ghia P., Hadzidimitriou A., Murray F., Potter K.N., Rosenquist R., Stamatopoulos K., Belessi C. Immunoglobulin sequence analysis and prognostication in CLL: guidelines from the ERIC review board for reliable interpretation of problematic cases. Leukemia. 2011;25:979–984. doi: 10.1038/leu.2011.49. [DOI] [PubMed] [Google Scholar]
- 42.Visco C., Moretta F., Falisi E., Facco M., Maura F., Novella E., Nichele I., Finotto S., Giaretta I., Ave E., Perbellini O., Guercini N., Scupoli M.T., Trentin L., Trimarco V., Neri A., Semenzato G., Rodeghiero F., Pizzolo G., Ambrosetti A. Double productive immunoglobulin sequence rearrangements in patients with chronic lymphocytic leukemia. Am J Hematol. 2013;88:277–282. doi: 10.1002/ajh.23396. [DOI] [PubMed] [Google Scholar]
- 43.Kern W., Bacher U., Schnittger S., Dicker F., Alpermann T., Haferlach T., Haferlach C. Flow cytometric identification of 76 patients with biclonal disease among 5523 patients with chronic lymphocytic leukaemia (B-CLL) and its genetic characterization. Br J Haematol. 2014;164:565–569. doi: 10.1111/bjh.12652. [DOI] [PubMed] [Google Scholar]
- 44.Kriangkum J., Motz S.N., Mack T., Beiggi S., Baigorri E., Kuppusamy H., Belch A.R., Johnston J.B., Pilarski L.M. Single-cell analysis and next-generation immuno-sequencing show that multiple clones persist in patients with chronic lymphocytic leukemia. PLoS One. 2015;10:e0137232. doi: 10.1371/journal.pone.0137232. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Stamatopoulos B., Timbs A., Bruce D., Smith T., Clifford R., Robbe P., Burns A., Vavoulis D.V., Lopez L., Antoniou P., Mason J., Dreau H., Schuh A. Targeted deep sequencing reveals clinically relevant subclonal IgHV rearrangements in chronic lymphocytic leukemia. Leukemia. 2017;31:837–845. doi: 10.1038/leu.2016.307. [DOI] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
Quality metrics of the validation set: line graph showing the percentage of bases with a quality score of ≥30 (Q30 scores) in each sample evaluated on the MiSeq platform with framework region (FR) 1 and leader primers, and the percentage of bases with a quality score of ≥20 (Q20 scores) for PGM FR1 assessment. Because of the inherent differences in chemistries between the two sequencing platforms, only Q20 evaluation is shown for PGM. Overall, the following per-run quality control metrics were considered optimal for MiSeq: cluster density of 600 ± 107 K/mm2 and cluster passing filter percentage >80%. Mean Q30 scores of >70% and >75% were considered acceptable for MiSeq leader and FR1 sequencing, respectively. The quality metrics for PGM runs were based on the PGM run report for the Ion Sphere Particle summary: loading, >50%; enrichment, >50%; and clonal, >50%.
Validation set: discordant case (case 40) by PGM1 versus MiSeq framework region (FR) 1. Sequence alignment of the top two sequences by PGM output (cases 1 and 2, representing 65.4% and 12.8% of the total reads, respectively) to the output generated by the FR1-MiSeq assay shows that these are identical, with the exception of mismatches detected in the homopolymer regions. Furthermore, the leader primer set, which was only available on the MiSeq platform, supports detection of this clonal sequence. The output on the MiSeq platform shows that the available sequenced segment by FR1 is identical to that generated by leader, which is predicted to be productive.
Validation set: discordant case (case 48). Sequence alignment of the top productive sequence by leader (516 bp), framework region (FR) 1 MiSeq and PGM (266 bp), and reference method (Ref.) shows that FR1-generated sequences are identical to leader, whereas the sequence by the reference method is similar with a few small insertions and several point mutations, accounting for the higher somatic hypermutation (SHM) rate. The difference in SHM rate as generated by FR1 (MiSeq or PGM, 1.76%) compared with leader (2.36%) is likely due to the leader primers capturing a larger portion of the rearranged IGHV gene, including the entire FR1 region, which leads to a higher calculated mutation rate in this case. A second related productive clonal sequence with a 41-bp deletion that may represent clonal drift (Figure 1A, alignment not shown) was also identified in this sample by all four methods and is best detected by leader, whereas FR1 primers amplify downstream of the deleted region, making it harder to distinguish as a separate clonotype from the rearranged sequence shown here.
Validation set: comparison of mutation rates by the framework region (FR) 1 primer set sequenced on the PGM and MiSeq platforms for all 48 matching clonal sequences detected. A: Scatterplot showing the correlation and linear regression analysis of mutation rates (r = 1; 95% CI, 1 to 1; R2 = 1; P < 0.0001). The best fit linear regression line (black line) overlaps with the 95% CIs (red dashed lines). B: Bland-Altman plot showing a difference versus mean analysis of the mutation rates identified by the two platforms (PGM versus MiSeq), showing no differences [mean ± SD, 0.0 ± 0.0; 95% limits of agreement, 0 to 0 (red dotted lines)].
Validation cohort: distribution of mean mutation frequencies along the IGHV gene regions in all productive matching rearrangements with <100% identity compared with germline (19 clonotypes). The mean mutation rates per framework region (FR) and complementarity-determining region (CDR) were analyzed in 19 productive clonal sequences with any degree of mutation (>0%, nongermline configuration) generated by the leader primer set. The mutation rate per region was calculated as the percentage of mismatched bases per length of each specific region. The mean mutation rates per region are shown in the bar graph, and the ranges for each are as follows: FR1, 0% to 8%; CDR1, 0% to 21%; FR2, 0% to 8%; CDR2, 0% to 42%; FR3, 2% to 13%; and CDR3, 0% to 14%.
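The per-region calculation described in this legend (mismatched bases as a percentage of region length) can be sketched as follows. This is a minimal illustration that assumes the clonal sequence and its germline reference are already aligned to equal length without gaps; the function name, the toy sequences, and the region coordinates are hypothetical and do not reflect IMGT numbering or the pipeline used in the study.

```python
def region_mutation_rates(query, germline, regions):
    """Percentage of mismatched bases per region.

    query, germline: pre-aligned sequences of equal length (no gaps assumed).
    regions: dict mapping region name -> (start, end), 0-based, end exclusive.
    """
    if len(query) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    rates = {}
    for name, (start, end) in regions.items():
        if not 0 <= start < end <= len(germline):
            raise ValueError(f"invalid coordinates for region {name}")
        mismatches = sum(
            1 for q, g in zip(query[start:end], germline[start:end]) if q != g
        )
        rates[name] = 100.0 * mismatches / (end - start)
    return rates


# Toy 20-base example with two hypothetical 10-base regions.
germline = "ACGTACGTACGTACGTACGT"
query = "ACGTACGAACGTTCGTACGA"  # mismatches at positions 7, 12, and 19
regions = {"FR1": (0, 10), "CDR1": (10, 20)}

print(region_mutation_rates(query, germline, regions))
# prints {'FR1': 10.0, 'CDR1': 20.0}
```

Because each rate is normalized by the length of its own region, a short region such as a CDR can show a high percentage from only a few mismatches, which is consistent with the wide CDR ranges listed in the legend.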
Comparison of mutation rates when including and in silico removing the framework region (FR) 1 from 39 clonal productive sequences generated by the leader primer set. A: Scatterplot showing the correlation and linear regression analysis of mutation rates (r = 0.9867; 95% CI, 0.975 to 0.993; R2 = 0.974; P < 0.0001). The best fit linear regression line (black line) and 95% CIs (red dashed lines) are shown. B: Bland-Altman plot showing a difference versus mean analysis of the mutation rates identified by the two methods (leader versus FR1 removed from leader) with a mean ± SD bias of 0.480 ± 0.727 (black line) and 95% limits of agreement of −0.945 to 1.90 (red dotted lines). The green line represents a difference between the assays of zero. C: Histogram showing frequency distribution of mutation rates grouped by bins: 0%, 0.1% to 1.4%, 1.5% to 1.9%, 2.0% to 3.0%, and >3.1%. The two sets of bars correspond to results obtained by leader (black bars) and removal of FR1 from leader generated sequences (gray bars), which are identical.
Clinical cohort sequenced with leader: distribution of mean mutation frequencies along the IGHV gene regions in all productive rearrangements with <100% identity compared with germline. The mean mutation rates per framework region (FR) and complementarity-determining region (CDR) were analyzed from the productive clonal sequences with any degree of mutation (>0%, nongermline configuration) sequenced by leader (135 of 228 productive sequences with a mutation rate >0%). The mutation rate per region was calculated as the percentage of mismatched bases per length of each specific region. The mean mutation rates per region are shown in the bar graph, and the ranges are as follows: FR1, 0% to 12%; CDR1, 0% to 33%; FR2, 0% to 12%; CDR2, 0% to 33%; FR3, 0% to 17%; and CDR3, 0% to 29%.




