Characterization of prediagnostic and diagnostic samples. (A) IGHV mutational status of all samples with a dominant clonotype frequency >2% from patients with future CLL. If no clonotype with a frequency >2% was present in the earliest sample of a patient, dominant clonotypes from any later measurements were used. Mutated clonotypes are shown as red circles, whereas unmutated clonotypes (>98% IGHV germline sequence identity) are shown as blue triangles (upper panel). Kaplan-Meier (survival) curve for time until CLL diagnosis stratified by IGHV mutational status; the log-rank test found no significant difference (lower panel). (B) Overview of all CLL subsets identified in the data; major subsets are divided into those associated with an aggressive or an indolent disease course. Minor subsets are shown separately in light green. Samples in which the K16 and YDSD motifs and the R110 mutation of light chain subset 2L were confirmed are shown as purple triangles. Patients with repeated samples are connected by a black line (upper panel). Kaplan-Meier (survival) curve for the time to CLL diagnosis from prediagnostic sample collection stratified by BcR stereotyped CLL subsets; the log-rank test found no significant difference (lower panel). (C) Cumulative frequency of clonotypes detected at CLL diagnosis for the prediagnostic sample(s) and the diagnostic sample. Cumulative indicates that if >1 clonotype was shared between the prediagnostic and diagnostic samples (eg, patients 1 and 2 in panel D), their frequencies were summed. For patients for whom no diagnostic sample was available, the most skewed clonotype in the sample closest to diagnosis was traced instead. (D) The distribution of the BcR IGH gene repertoire of 3 patients with multiple clonotypes that underwent significant shifts over time to diagnosis. 
The dominant clonotype at CLL diagnosis is shown in red for each sample, with any secondary clonotype at diagnosis shown in blue. All small unrelated clonotypes (frequency <5%) were summed and are shown as background in light green. For more information on all patients with diagnostic material, see supplemental Tables 1 and 3.
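The summations described for panels C and D can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the clonotype labels, and the example frequencies are hypothetical, and frequencies are assumed to be expressed as percentages of the repertoire.

```python
def cumulative_frequency(sample, diagnostic_clonotypes):
    """Sum the frequencies (%) of clonotypes in `sample` that are also
    detected at diagnosis (the "cumulative frequency" of panel C)."""
    return sum(freq for clonotype, freq in sample.items()
               if clonotype in diagnostic_clonotypes)


def background_frequency(sample, diagnostic_clonotypes, threshold=5.0):
    """Sum the frequencies (%) of small unrelated clonotypes, i.e. those
    below `threshold` and not detected at diagnosis (background in panel D)."""
    return sum(freq for clonotype, freq in sample.items()
               if clonotype not in diagnostic_clonotypes and freq < threshold)


# Hypothetical prediagnostic sample: clonotype label -> frequency in %.
prediagnostic = {"clone_A": 30.0, "clone_B": 12.5, "clone_C": 2.0, "clone_D": 1.5}
# Hypothetical set of clonotypes detected in the diagnostic sample.
at_diagnosis = {"clone_A", "clone_B"}

print(cumulative_frequency(prediagnostic, at_diagnosis))  # 42.5
print(background_frequency(prediagnostic, at_diagnosis))  # 3.5
```

In this example, two clonotypes are shared with the diagnostic sample, so their frequencies are added together, while the remaining clonotypes below the 5% cutoff are pooled into the background.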