Abstract
With recent rapid advances in genomic technologies, precise delineation of structural chromosome rearrangements at the nucleotide level is becoming increasingly feasible. In this era of “next-generation cytogenetics” (i.e., an integration of traditional cytogenetic techniques and next-generation sequencing), a consensus nomenclature is essential for accurate communication and data sharing. Currently, nomenclature for describing the sequencing data of these aberrations is lacking. Herein, we present a system called Next-Gen Cytogenetic Nomenclature, which is concordant with the International System for Human Cytogenetic Nomenclature (2013). This system starts with the alignment of rearrangement sequences by BLAT or BLAST (alignment tools) and arrives at a concise and detailed description of chromosomal changes. To facilitate usage and implementation of this nomenclature, we are developing a program designated BLA(S)T Output Sequence Tool of Nomenclature (BOSToN), a demonstrative version of which is accessible online. A standardized characterization of structural chromosomal rearrangements is essential both for research analyses and for application in the clinical setting.
Introduction
It has been over half a century since the human chromosome number was reported to be 46 and not 48,1 an event that vitalized interest in human cytogenetics and subsequently required an international system of nomenclature to be developed. Therefore, a study group at the “Denver Conference” (1960) proposed a system that became the foundation of human cytogenetic nomenclature.2 Soon thereafter, human chromosome bands were identified,3,4 and a meeting held in Paris in 1971 yielded a significant document in the annals of human cytogenetics by recommending a classification system for chromosome bands and by providing nomenclature for structural chromosome rearrangements.5 In 1978, previous international conference reports were reviewed and combined into a document titled “An International System for Human Cytogenetic Nomenclature (1978),” abbreviated as ISCN 1978 by the International Standing Committee on Human Cytogenetic Nomenclature.6
Rapid growth in the knowledge of constitutional and acquired chromosome aberrations after improvements in cytogenetic techniques (including high-resolution banding, in situ hybridization, and microarrays) has necessitated revisions to the ISCN, published subsequently in 1981,7 1985,8 1991,9 1995,10 2005,11 2009,12 and most recently in 2013.13 ISCN 2013 provides instruction on a variety of issues arising from new molecular genetic methodologies (e.g., chromothripsis and duplication, using a genome build when describing microarray results, nomenclature for targeted quantitative assays). With the exception of microarray nomenclature, which can describe unbalanced copy-number variations (CNVs), nomenclature for structural chromosome rearrangements is limited to the description of chromosome-band levels and in situ hybridization probes. Therefore, a consensus system for the description of chromosomal abnormalities at the level of nucleotide resolution achievable routinely by next-generation sequencing methods has yet to be addressed.
Sequencing breakpoints of structural chromosome rearrangements has been possible since the 1980s.14,15 However, timely localization of these aberrations to the nucleotide level in a genome-wide context has become feasible only with recent improvements in massively parallel sequencing technologies, an advance that promises to revolutionize the field of cytogenetics.16–19 Historical methods that have been used in cytogenetics are unable to resolve structural rearrangements to the nucleotide level, which makes it difficult to analyze the actual pathological burden. Specific identification of disrupted genomic region(s) is critical in diagnosis and management of constitutional and acquired rearrangements, especially as annotation of the human genome accelerates.20–31 The ability to map breakpoints precisely is the foundation of the Developmental Genome Anatomy Project (DGAP), through which more than 100 subjects with apparently balanced chromosome rearrangements and congenital disorders have been sequenced, revealing a wide variety of genes disrupted and dysregulated in human development. Throughout these DGAP studies, it has become increasingly clear that additional nomenclature guidance is needed for accurate and full description of these chromosome aberrations. Currently, most reports of sequencing results provide nucleotide numbers of the breakpoints in various formats based upon the reference-genome sequence alignment.20–29 However, additional important characteristics of the rearrangement (including reference-genome identification, chromosome-band level, direction of the sequence, homology, repeats, and nontemplated sequence) are not typically described.
Terminology that can be understood universally among scientists and clinicians with optimal description will enhance communication of sequencing data of such rearrangements. Thus, we propose Next-Gen Cytogenetic Nomenclature, a set of recommendations elucidated from sequencing results and analogous with ISCN 2013 for structural chromosome rearrangements, and use examples largely derived from previously published DGAP cases. These recommendations can potentially address all of the above-mentioned characteristics of structural rearrangements in a systematic approach.
Material and Methods
The UCSC Genome Browser32 is used for aligning rearrangement sequences obtained through capillary sequencing confirmation of constitutional and neoplastic chromosome aberrations after next-generation sequencing.19 This website provides BLAST (Basic Local Alignment Search Tool33)-Like Alignment Tool (BLAT) for sequence alignment.34 Using this tool for manual interpretation of the rearrangements has several advantages, including being faster than BLAST and having a direct link into the UCSC Genome Browser.35 Screenshots from the website are provided according to the website guidelines titled “Citing the UCSC Browser in a Publication or Web Page” (Figures 2, S2–S4, S6–S8, S10–S12, S14, S15, S17–S20, S22–S25, S28–S32, S36–S39, S41–S43, S47, S50, and S53, available online). When applicable, sample cases herein are referred to by their DGAP identification numbers.
Figure 2.
BLAT Alignment of a Rearrangement Sequence from a Hypothetical Case with a Simple Translocation of t(1;8)(q23.3;q12.2)
There are four breakpoints and two rearrangements (left). Rearrangement_A (on der(1)) is submitted to “BLAT Search Genome,” and accurate alignments are chosen as shown in the flowchart (right). The same process is then applied to Rearrangement_B (data not shown).
To facilitate explanations throughout this document, we define the terms breakpoint and rearrangement sequence below.
Breakpoint: the nucleotides on each side of the break (Figure 1, left) and reunion (Figure 1, middle).
Rearrangement sequence: the sequence that encompasses rearranged breakpoints and, therefore, the breakage and reunion (::) (Figure 1, right). In other words, this sequence is the junction of DNA sequences that are noncontiguous in the reference sequence and that occur as a result of the structural alteration of the chromosome(s). As described in ISCN 2013, structurally altered chromosomes may be referred to as derivative (der) chromosomes, among other terms and symbols.
Figure 1.
Schematic Diagrams of Simple Chromosomal Aberrations Show the Number of Breakpoints and Rearrangements
(A and B) Four breakpoints on each side of the break (left) and reunion (middle) and two rearrangements (right) in a simple translocation (A) and an inversion (B).
(C) Six breakpoints (left, middle) and three rearrangements (right) in a simple insertion.
(D) Two breakpoints (left, middle) and one rearrangement (right) in a simple deletion.
On the basis of these definitions, there are four breakpoints and two rearrangements in a “simple case” with a translocation or an inversion (Figure 1A and 1B); six breakpoints and three rearrangements in a simple insertion case (Figure 1C); and two breakpoints and one rearrangement in a simple deletion case (Figure 1D). Likewise, complex cases with more rearrangements have two breakpoints per rearrangement (Figure S33 [DGAP187]21). Consequently, it is expected that each of the two genomic-reference-sequence alignments corresponds to one of two breakpoints in a rearrangement sequence (Figures 2 and 3).
Figure 3.
Formatting the Rearrangement Sequence on the Basis of BLAT Search Results with BLA(S)T Output and Next-Gen Cytogenetic Nomenclature
A representative chromosome diagram of the aligned sequence is provided for Rearrangement_A (on der(1)). The BLA(S)T Output indicates that breakage and reunion occurred at 1q23.3, positive strand, nucleotide 164,789,896 and at 8q12.2, positive strand, nucleotide 61,760,650 (top) (nucleotide numbers with a purple background in the table indicate the breakpoints, and for illustration purposes, bracket locations in the chromosome diagram are not relative to actual chromosome positions). Next-Gen Cytogenetic Nomenclature is generated with BLA(S)T Outputs for both of the rearrangement sequences. Note that the derivative chromosomes are described from pter to qter, and therefore the strand directions are not included in the nomenclature (bottom).
As an illustration of the process of aligning sequences of the rearrangements with BLAT, a hypothetical truly balanced simple translocation case with t(1;8)(q23.3;q12.2) is presented in the following steps (Figure 2) (additional hypothetical simple cases akin to those in Figures 1B–1D are provided in Figures S1–S25):
1. Each of the rearrangement sequences is submitted with the default settings (Genome: human; Assembly: Feb. 2009; Query type: BLAT’s guess; Sort output: query, score; Output type: hyperlink; and the sequence name can be specified as “>sequence name”) (Figure 2, top right).
2. BLAT search results have 12 columns: ACTIONS, QUERY, SCORE, START, END, QSIZE, IDENTITY, CHRO, STRAND, START, END, and SPAN (Figure 2; Figures S28–S32). Herein, the interpretations of rearrangement sequences are based on these BLAT results.
• ACTIONS:
  - Browser: a hyperlink providing a UCSC Genome Browser view of the aligned sequence; it is used for determining the chromosome bands of the breakpoint regions.
  - Details: a hyperlink providing a nucleotide-to-nucleotide alignment of the query sequence to the genomic reference sequence; this feature is used for sequences where “span” and “score” values do not match and “identity” is not 100%.
• QUERY: the name of the submitted query; if a name is not provided, the default display is “YourSeq.”
• SCORE: a result that correlates with the number of matching nucleotides in the query sequence to the reference sequence (for a detailed description, see “Replicating web-based Blat percent identity and score calculations” in the Web Resources).
• START: the position of the first nucleotide in the query sequence aligned to the reference sequence.
• END: the position of the last nucleotide in the query sequence aligned to the reference sequence.
• QSIZE: the total number of nucleotides in the query sequence.
• IDENTITY: a result that correlates with the ratio of matching nucleotide number in the aligned query sequence to the total nucleotide number in the aligned query sequence and the corresponding genomic reference sequence (for a detailed description, see “Replicating web-based Blat percent identity and score calculations” in the Web Resources).
• CHRO: the chromosome to which the query sequence aligns.
• STRAND: the orientation (i.e., positive or negative strand) of the query sequence relative to the positive strand of the genomic reference sequence. If the query strand is positive, the start nucleotide of the query corresponds to the start nucleotide of the aligned genomic reference sequence and vice versa. However, if the query strand is negative, the start nucleotide of the query corresponds to the end nucleotide of the genomic reference sequence and vice versa (Figures 6 and 7; Supplemental Data).
• START: the start nucleotide of the aligned genomic reference sequence.
• END: the end nucleotide of the aligned genomic reference sequence.
• SPAN: the span in nucleotides of the aligned genomic reference sequence.
3. A single rearrangement sequence is expected to have two genomic-reference-sequence alignments. In the event of more than two alignments, the steps below should be followed (Figure 2):
• The highest score alignment with the lowest query start point is identified. Then, IDENTITY and SPAN are checked. This sequence would represent the first aligned part of the rearrangement (query) sequence, and the breakpoint is identified on the basis of the strand orientation. In the case of t(1;8)(q23.3;q12.2), the first part of Rearrangement_A (on der(1)) aligns to chromosome 1 (Figure 2, bottom right).
• The highest score alignment with the highest query end point is identified. Then, IDENTITY and SPAN are checked. This sequence would represent the second aligned part of the rearrangement (query) sequence, and the breakpoint is identified on the basis of the strand orientation. In the case of t(1;8)(q23.3;q12.2), the second part of Rearrangement_A (on der(1)) aligns to chromosome 8 (Figure 2, bottom right).
4. Obtaining the true alignment results of a single rearrangement sequence does not provide information for deletions or duplications at the breakpoints of a translocation, an inversion, an insertion, or a more complex rearrangement event. For this information, alignment results of all rearrangement sequences involved in a single case should be analyzed, and the continuity of breakpoint nucleotides with corresponding sequence direction should be assessed (Figures S26 and S27).
5. Details for the rearrangement sequences encompassing additional events, such as an insertion or a nontemplated sequence in the vicinity of the break and reunion, are provided in the Recommendations, as well as in the Supplemental Data.
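The selection in step 3 and the strand rule given for the STRAND column can be sketched in code. The following is a minimal illustration, not the BOSToN implementation: the `Hit` record, the function names, and the query coordinates and scores in the usage rows are hypothetical, and "highest score alignment with the lowest query start point" is read here as "among the alignments sharing the lowest query start, take the one with the highest score." The manual IDENTITY and SPAN checks are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One row of a BLAT result table (hypothetical container)."""
    score: int
    q_start: int  # START: first aligned nucleotide of the query
    q_end: int    # END: last aligned nucleotide of the query
    chrom: str    # CHRO
    strand: str   # STRAND: "+" or "-"
    t_start: int  # START on the genomic reference sequence
    t_end: int    # END on the genomic reference sequence


def pick_parts(hits):
    """Return (first_part, second_part) of a rearrangement sequence."""
    lowest_start = min(h.q_start for h in hits)
    highest_end = max(h.q_end for h in hits)
    first = max((h for h in hits if h.q_start == lowest_start),
                key=lambda h: h.score)
    second = max((h for h in hits if h.q_end == highest_end),
                 key=lambda h: h.score)
    return first, second


def breakpoint_of(hit, part):
    """Genomic nucleotide that sits next to the break and reunion.

    For the first part, the junction is at the query END; for the second
    part, it is at the query START. On the negative strand the query start
    corresponds to the genomic end, and vice versa.
    """
    if part == "first":
        return hit.t_end if hit.strand == "+" else hit.t_start
    return hit.t_start if hit.strand == "+" else hit.t_end


# Hypothetical rows for Rearrangement_A of t(1;8)(q23.3;q12.2);
# only the two breakpoint nucleotides are taken from the text.
hits = [
    Hit(300, 1, 300, "1", "+", 164_789_597, 164_789_896),
    Hit(250, 301, 550, "8", "+", 61_760_650, 61_760_899),
    Hit(25, 1, 25, "12", "-", 5_000_000, 5_000_024),  # spurious short hit
]
first, second = pick_parts(hits)
print(first.chrom, breakpoint_of(first, "first"))
print(second.chrom, breakpoint_of(second, "second"))
```

With these rows the sketch selects the chromosome 1 and chromosome 8 alignments and reports nucleotides 164,789,896 and 61,760,650 as the breakpoints of Rearrangement_A.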
Figure 6.
BLA(S)T Outputs for Nontemplated Sequences
The nontemplated sequence of t(1;5)(p22.3;q14.3) on der(5) represents a repeat of nine nucleotides and an additional A nucleotide (A). The nontemplated sequence from inv(8)(q11.21q24.33) represents a quadruple repeat of seven nucleotides (B). In the rearrangement-sequence formats, nontemplated sequences are indicated in black, and repeated sequences are highlighted within parentheses. In the tables, nucleotide numbers with a purple background indicate the breakpoints.
Figure 7.
BLA(S)T Output for an Inverted Repeat as a Nontemplated Sequence
In a complex case with t(3;8)(q25.32;q24.21), the nontemplated sequence and nucleotides 663–684 of the rearrangement sequence correspond to nucleotides 128,564,042–128,564,063 of the chromosome 8 reference sequence in an opposite orientation (in the table, nucleotide numbers with a purple background indicate the breakpoints) (A). This result is obtained through BLAST alignment of the nontemplated sequence to the rearrangement sequence, revealing an identical match between the two sequences, excluding the self-alignment (arrows show the opposite strand directions) (B).
Recommendations
ISCN 2013 should be followed both for the general principles of structural-rearrangement nomenclature and for the use of cytogenetic symbols and abbreviated terms.13 Recommendations given below are suggested additions to current guidelines and are based on the perspective of DNA sequencing of structural chromosome abnormalities derived from our efforts to describe sequencing results from DGAP cases with apparently balanced rearrangements20–24 and from tumor genomes with somatic rearrangements (Z.O., J.F.G., M.T., and C.C.M., unpublished data). Wherever germane, ISCN 2013 criteria are integrated.
Describing the Sequencing Method
The descriptive narrative or interpretation should include the details of sequencing and confirmation method (if applicable) and whether the sequencing represents the entire genome (similar to ISCN 2013 for microarray nomenclature).
Choosing the Rearrangement-Sequence Strand to be Reported
An example of the hypothetical t(1;8)(q23.3;q12.2) case (akin to that in Figure 4C; the hypothetical inversion and p to q translocation cases as in Figure 4B and 4D are provided in Figures S5–S12) is as follows:
der(1):
(+) strand: 1pter→1q23.3::8q12.2→8qter
(−) strand: 8qter→8q12.2::1q23.3→1pter
Rearrangement_A should be reported with the use of the positive (+) strand.
der(8):
(+) strand: 8pter→8q12.2::1q23.3→1qter
(−) strand: 1qter→1q23.3::8q12.2→8pter
Rearrangement_B should be reported with the use of the positive (+) strand.
Figure 4.
Strand Direction and Identity of the Derivative Chromosome Designation by the Chromosome Segment that Includes the Centromere
(A) Structurally normal chromosomes.
(B) Pericentric inversion.
(C) q to q translocation.
(D) p to q translocation.
The dotted arrow indicates the positive (+) strand direction of the corresponding chromosome segment, and the thick arrow indicates the positive strand direction of the chromosome.
The DNA molecule consists of two strands with opposite directions. The positive (+) strand of the genomic reference sequence is designated as the sequence that goes from the terminal end of the short arm (pter, the first nucleotide of a chromosome) to the terminal end of the long arm (qter, the last nucleotide of a chromosome), i.e., increasing numbers of nucleotides, and vice versa is true for the negative (−) strand (Figure 4A). Although nucleotide numbers increase starting from pter, band designations increase numerically on both p and q arms by going outward from the centromere. The orientations described herein are according to nucleotide numbers that are in parallel with the positive strand direction, but not necessarily with band orientations.
ISCN 2013 describes a derivative chromosome as a structurally rearranged chromosome that has an intact centromere. We propose that directionality of the derivative chromosome be defined by the directionality of the chromosome segment that includes the centromere as present in the human genomic reference sequence, regardless of the directionality of other segments in the derivative chromosome (Figure 4; Supplemental Data).
The rearrangement sequence should be reported with the use of the positive strand of the derivative chromosome for ease of interpretation. Each rearrangement sequence within a case is designated with a letter (e.g., Rearrangement_A, Rearrangement_B, …, Rearrangement_AA, Rearrangement_AB, etc.). Assignment of letters to rearrangement sequences is arbitrary (although the letters are assigned in alphabetical order) because the letters are designated prior to interpretation of the rearrangement-sequence alignments.
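A junction that was sequenced on the negative strand of the derivative chromosome can be converted to the positive strand by reverse-complementing the read, which reverses the order of the aligned parts and flips each strand sign. The sketch below illustrates this under stated assumptions: the tuple layout `(band, strand, position)`, the function names, and the idea of passing the index of the part that carries the centromere are all hypothetical conveniences, and the flipped form is inferred from the der(1) example above rather than quoted from it.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def flip_parts(parts):
    """Describe the same junction as read from the opposite strand."""
    flip = {"+": "-", "-": "+"}
    return [(band, flip[strand], pos) for band, strand, pos in reversed(parts)]


def report_on_positive_strand(parts, centromere_index):
    """Flip the junction if the centromere-bearing part is on the (-) strand.

    centromere_index is the position in `parts` of the segment that includes
    the centromere of the derivative chromosome (supplied by the user).
    """
    if parts[centromere_index][1] == "-":
        return flip_parts(parts)
    return list(parts)


# der(1) junction as it might be read on the negative strand; the
# chromosome 1 segment (index 1 here) carries the centromere.
read = [("8q12.2", "-", 61_760_650), ("1q23.3", "-", 164_789_896)]
print(report_on_positive_strand(read, centromere_index=1))
```

The usage example returns the chromosome 1 part first, with both strands positive, matching the orientation recommended for Rearrangement_A.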
BLA(S)T Output and the Rearrangement-Sequence Format
BLA(S)T Output is not the final nomenclature for a chromosome aberration; rather, it is the interpretation of an individual rearrangement sequence. For constructing the suggested Next-Gen Cytogenetic Nomenclature, the information provided in BLA(S)T Outputs of individual rearrangement sequences is combined to describe the rearranged chromosomes from pter to qter. If BLA(S)T Outputs are listed separately from the nomenclature (e.g., in a tabular format), genome build numbers should be indicated.
The following is an example of the hypothetical t(1;8)(q23.3;q12.2) case:
Rearrangement_A (on der(1)) (Figure 3, top)
BLA(S)T Output: 1q23.3(+)(164,789,896)::8q12.2(+)(61,760,650)
Breakage and reunion have occurred at 1q23.3, positive strand, nucleotide 164,789,896 and 8q12.2, positive strand, nucleotide 61,760,650.
Rearrangement_B (on der(8))
BLA(S)T Output: 8q12.2(+)(61,760,649)::1q23.3(+)(164,789,897)
Breakage and reunion have occurred at 8q12.2, positive strand, nucleotide 61,760,649 and 1q23.3, positive strand, nucleotide 164,789,897.
The result of aligning a rearrangement sequence with an alignment tool is referred to as “BLA(S)T Output” because it can be obtained with the use of either BLAT or BLAST. In designating the BLA(S)T Output, three major components are required: (1) details of the first aligned part of the query, (2) break and reunion (::), and (3) details of the second aligned part of the query. Each item within a component is written without any spaces in between. In detail, the BLA(S)T Output is recommended to be formulated as follows (italic text in parentheses is used here for description of the major components and is not part of the format):
BLA(S)T Output: (first aligned part of the query)chromosome with band(strand)(breakpoint nucleotide number on the corresponding genomic reference sequence)(break and reunion)::(second aligned part of the query)chromosome with band(strand)(the breakpoint nucleotide number on the corresponding genomic reference sequence)
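The three-component format above lends itself to a small formatter. The following is a minimal sketch, assuming each aligned part is supplied as a hypothetical `(band, strand, position)` tuple; the function names are illustrative only. The negative strand is printed with the minus sign (−) used in this document.

```python
def format_part(band, strand, position):
    """Format one aligned part, e.g. 1q23.3(+)(164,789,896)."""
    sign = "−" if strand == "-" else "+"
    # The ',' format option places a comma every three digits.
    return f"{band}({sign})({position:,})"


def blast_output(first, second):
    """Join two aligned parts with the break-and-reunion symbol (::)."""
    return format_part(*first) + "::" + format_part(*second)


# Rearrangement_A and Rearrangement_B of the hypothetical t(1;8) case
print(blast_output(("1q23.3", "+", 164_789_896), ("8q12.2", "+", 61_760_650)))
print(blast_output(("8q12.2", "+", 61_760_649), ("1q23.3", "+", 164_789_897)))
```

No spaces are emitted between the items, in keeping with the recommendation that each item within a component be written without spaces.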
At the discretion of the laboratory, the rearrangement sequence may be color coded according to the determined matching alignments; also, nucleotides may be numbered and grouped in tens with 60 nucleotides in a row, as illustrated in Figures 3, 5, 6, and 7.
Figure 5.
Next-Gen Cytogenetic Nomenclature of a Case with a Duplication on Chromosome 8 in the Vicinity of a t(2;8)
When the rearrangement sequences are not known genome-wide, the nomenclature includes the BLA(S)T Output instead of describing the derivative chromosome from pter to qter; therefore, this nomenclature includes the strand directions (curly brackets indicate homology, and nucleotide numbers with a purple background in the table indicate the breakpoints).
General Principles of Next-Gen Cytogenetic Nomenclature
The following is an example of the hypothetical t(1;8)(q23.3;q12.2) case (Figure 3, bottom):
Short system: seq[GRCh37/hg19] t(1;8)(q23.3;q12.2)dn
Detailed system: seq[GRCh37/hg19] t(1;8)(1pter→1q23.3(164,789,896)::8q12.2(61,760,650)→8qter;8pter→8q12.2(61,760,649)::1q23.3(164,789,897)→1qter)dn
Method
To specify that the results are obtained through sequencing, “seq” should be included at the beginning of the nomenclature (akin to using “arr” for microarrays and “ish” for in situ hybridization [ISCN 2013]).
Reference-Genome Build
Both Genome Reference Consortium human build (GRCh) and human genome (hg) numbers are included at the beginning of the nomenclature within brackets ([ ]) and separated by a forward slash (/). Although “seq” precedes the bracket without a space, a space is present after the bracket (as in ISCN 2013 for microarray nomenclature).
Order of Chromosomes
If one of the rearranged chromosomes is a sex chromosome, it is described first. Rearranged autosomes are listed in numerical order.
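The ordering rule can be expressed as a sort key. This is a minimal sketch with hypothetical function names; placing X before Y when both are present is an assumption of the sketch, since the rule above states only that a sex chromosome is described first.

```python
def chromosome_sort_key(chrom):
    """Sex chromosomes first (X before Y, by assumption), then autosomes."""
    sex_rank = {"X": 0, "Y": 1}
    if chrom in sex_rank:
        return (0, sex_rank[chrom])
    return (1, int(chrom))


def order_chromosomes(chroms):
    """Return chromosome labels in the order used in the nomenclature."""
    return sorted(chroms, key=chromosome_sort_key)


print(order_chromosomes(["8", "1"]))   # autosomes in numerical order
print(order_chromosomes(["5", "X"]))   # sex chromosome first
```

Autosomes are compared as integers so that, for example, chromosome 5 precedes chromosome 10.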
Short and Detailed Systems
ISCN 2013 has two systems to designate structural abnormalities. Whereas the short system only provides breakpoint band designations of involved chromosomes (e.g., t(12;14)(q13;p11)), the detailed system describes each of the rearranged chromosomes from pter to qter (e.g., t(12;14)(12pter→12q13::14p11→14pter;12qter→12q13::14p11→14qter)). We recommend describing each abnormality first by using the short system and then by using the detailed system for Next-Gen Cytogenetic Nomenclature (and listing nucleotide numbers only in the latter).
Nucleotide Numbers
Nucleotide numbers are specified within parentheses after the band designation of the abnormal region. If rearrangements are within a single chromosome (e.g., deletion, inversion, etc.), the chromosome number is not specified in the band designation. Commas are recommended to facilitate reading nucleotides and are placed every three digits for numbers comprising four or more digits (similar to ISCN 2013 for ish and microarray nomenclature).
Inheritance and Constitutionality
The inheritance symbols (dn [de novo], mat [maternal], and pat [paternal]) are used after the description of the rearrangement without a space after the last parenthesis. If the inheritance symbol follows another abbreviation, a space should be inserted (as in ISCN 2013).
When acquired and constitutional aberrations are found in the same case, the latter is indicated by “c” directly after the designation of the constitutional abnormality without a space. Note that if the inheritance is known, “mat” or “pat” takes the place of the “c” (similar to ISCN 2013).
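The general principles above (method prefix, genome build in brackets, chromosome and band lists, nucleotide numbers, and inheritance symbol) can be assembled as shown in the following sketch. It covers only a simple two-chromosome translocation; the function names and the six-field tuple describing each derivative chromosome are hypothetical choices made for the illustration.

```python
def short_system(chroms, bands, build="GRCh37/hg19", inheritance=""):
    """Short system, e.g. seq[GRCh37/hg19] t(1;8)(q23.3;q12.2)dn."""
    return f"seq[{build}] t({';'.join(chroms)})({';'.join(bands)}){inheritance}"


def detailed_system(chroms, derivatives, build="GRCh37/hg19", inheritance=""):
    """Detailed system: each derivative chromosome from pter to qter.

    Each derivative is (start_terminus, band_1, nucleotide_1,
    band_2, nucleotide_2, end_terminus).
    """
    described = ";".join(
        f"{start}→{band1}({pos1:,})::{band2}({pos2:,})→{end}"
        for start, band1, pos1, band2, pos2, end in derivatives
    )
    return f"seq[{build}] t({';'.join(chroms)})({described}){inheritance}"


derivatives = [
    ("1pter", "1q23.3", 164_789_896, "8q12.2", 61_760_650, "8qter"),
    ("8pter", "8q12.2", 61_760_649, "1q23.3", 164_789_897, "1qter"),
]
print(short_system(["1", "8"], ["q23.3", "q12.2"], inheritance="dn"))
print(detailed_system(["1", "8"], derivatives, inheritance="dn"))
```

The two printed lines reproduce the short and detailed descriptions of the hypothetical t(1;8)(q23.3;q12.2) case, with "seq" directly preceding the bracket, a space after it, and "dn" appended without a space.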
Missing Whole-Genome Information
If a rearrangement sequence(s) is known without knowledge of additional information of any other chromosomal aberrations (e.g., from previous publication, capillary sequencing confirmation of a single rearrangement sequence, etc.), the BLA(S)T Output can be incorporated after the description of the event, for example14 (Figure 5),
BLA(S)T Output: 8q24.21(+)(128,749,16{0})::8q24.21(+)(128,746,67{6})
This shows an approximately 2.5 kb direct duplication (if it were inverted, the second alignment would be in the opposite strand orientation) described in a previously reported case in the vicinity of a t(2;8).
Next-Gen Cytogenetic Nomenclature: seq[GRCh37/hg19] dup(8)(q24.21(+)(128,749,16{0})::q24.21(+)(128,746,67{6}))
Because the additional rearrangement sequences are not provided and the sequencing is not genome-wide, the nomenclature includes the BLA(S)T Output only for the duplication event with known sequence instead of describing der(8) from pter to qter. For the interpretation of homologies and curly brackets, see “Homology.”
Combining Sequencing Results with Other Cytogenetic Methods
To separate various cytogenetic observations, ISCN 2013 recommends using a period followed by the symbol of the method (ish, arr, etc.) and the result of the specified method. The G-banded karyotype is specified first without a symbol. The same approach can be applied while combining sequencing results with other cytogenetic methods.
Examples are as follows:
Short system: 46,XX,t(5;10)(p13.3;q21.1)dn.seq[GRCh37/hg19] t(5;10)(p13.3;q21.3)dn
Detailed system: 46,XX,t(5;10)(p13.3;q21.1)dn.seq[GRCh37/hg19] t(5;10)(10qter→10q21.3(67,539,997)::5p13.3(29,658,440)→5qter;10pter→10q21.3(67,539,990)::5p13.3(29,658,426)→5pter)dn
This shows whole-genome sequencing detecting a translocation observed in the G-banded karyotype. Note that the cytogenetic-band assignments are derived from G-banded chromosomes and that the sequencing-band assignments are derived from genome browsers; therefore, they are not always concordant with each other (as in ISCN 2013 for microarray nomenclature). In addition, the breakpoint nucleotides of 67,539,990 and 67,539,997 and their sequence directions indicate a 6 bp deletion on chromosome 10, and the breakpoint nucleotides of 29,658,426 and 29,658,440 and their sequence directions indicate a 13 bp deletion on chromosome 5. For the sake of simplicity, in the detailed system, the short description of the G-banded karyotype is combined with the detailed description of the Next-Gen Cytogenetic Nomenclature.
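The deletion sizes quoted above follow from simple arithmetic on the two breakpoint nucleotides of the same chromosome: the nucleotides strictly between them are absent from both derivative chromosomes. A minimal sketch follows; the function name is hypothetical, and it assumes that both inputs are retained nucleotides flanking the gap, so a result of 0 corresponds to a balanced break.

```python
def bases_lost(breakpoint_a, breakpoint_b):
    """Count reference nucleotides lying strictly between two breakpoints."""
    return abs(breakpoint_a - breakpoint_b) - 1


print(bases_lost(67_539_990, 67_539_997))    # chromosome 10 in the t(5;10) case
print(bases_lost(29_658_426, 29_658_440))    # chromosome 5 in the t(5;10) case
print(bases_lost(164_789_896, 164_789_897))  # chromosome 1 in the t(1;8) case
```

The three calls give 6, 13, and 0, in agreement with the 6 bp and 13 bp deletions described for the t(5;10) and with the truly balanced hypothetical t(1;8).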
Short system: ish t(9;22)(ABL1+,BCR+;BCR+,ABL1+)[20].seq[GRCh37/hg19] t(9;22)(q34.1;q11.23)
Detailed system: ish t(9;22)(ABL1+,BCR+;BCR+,ABL1+)[20].seq[GRCh37/hg19] t(9;22)(9pter→9q34.1(133,643,307)::22q11.23(23,632,613)→22qter;22pter→22q11.23(23,632,612)::9q34.1(133,643,308)→9qter)
This shows in situ hybridization and sequencing detecting a t(9;22).
Short system: 46,XX.arr[hg19] 12p12.1(21,340,001–23,500,000)×1.seq[GRCh37/hg19] del(12)(p12.1p12.1)
Detailed system: 46,XX.arr[hg19] 12p12.1(21,340,001–23,500,000)×1.seq[GRCh37/hg19] del(12)(pter→p12.1(21,340,000)::p12.1(23,500,001)→qter)
This shows microarray and sequencing detecting a deletion that was not observed in the G-banded karyotype.
Homologous Chromosomes
If a rearrangement involves homologous chromosomes, one of the chromosome numerals should be underlined, for example,
seq[GRCh37/hg19] t(9;9)(9qter→9q22.33(102,425,452)::9p21.2(26,393,002)→9qter;9pter→9q22.33(102,425,451)::9p21.2(26,393,001)→9pter)
This shows a translocation between the short arm of chromosome 9 and the long arm of its homolog.
Homology
Some nucleotide(s) might belong to either chromosome at the breakpoints (i.e., a homology), and thus it is not possible to determine their chromosome of origin. The smallest digits of the nucleotide number best specifying the homology are placed in curly brackets ({ }), and the first and last nucleotides are separated by a dash (–). The first and last nucleotides are placed in the order in which they occur in the rearranged sequence and not in numerical order. If a segment is of sufficient length to change the band assignment, the start and end bands are included without a dash, e.g., 4q31.3(151,513,239–151,513,560) versus 4q31.3q32.1(151,513,239–156,815,165) as in ISCN 2013 for microarray nomenclature.
Examples are as follows:
seq[GRCh37/hg19] der(5)(10qter→10q21.3(67,539,99{7–5})::5p13.3(29,658,44{0–2})→5qter)
An unbalanced translocation between the short arm of chromosome 5 and the long arm of chromosome 10 results in a der(5) with a 3 nt homology at the break and reunion.
seq[GRCh37/hg19] der(5)(10qter→10q21.3(67,539,9{97–85})::5p13.3(29,658,4{40–52})→5qter)
A der(5) has a 13 nt homology at the break and reunion.
seq[GRCh37/hg19] der(5)(10qter→10q21.3(67,5{40,000–39,997})::5p13.3(29,658,44{0–3})→5qter)
A der(5) has a 4 nt homology at the break and reunion.
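The curly-bracket notation in these examples can be generated by writing both nucleotide numbers with commas and bracketing only the trailing characters that differ. The sketch below is one way to do this; the function names are hypothetical, and the handling of a single-nucleotide homology (bracketing the last digit) is modeled on the dup(8) example given under "Missing Whole-Genome Information."

```python
def homology_notation(first, last):
    """Bracket the smallest trailing digits that specify the homology.

    `first` and `last` are given in the order in which they occur in the
    rearranged sequence, not in numerical order.
    """
    a, b = f"{first:,}", f"{last:,}"
    if a == b:  # single-nucleotide homology
        return f"{a[:-1]}{{{a[-1]}}}"
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return f"{a[:i]}{{{a[i:]}–{b[i:]}}}"


def homology_length(first, last):
    """Number of nucleotides in the homology, both ends included."""
    return abs(first - last) + 1


print(homology_notation(67_539_997, 67_539_995), homology_length(67_539_997, 67_539_995))
print(homology_notation(67_540_000, 67_539_997), homology_length(67_540_000, 67_539_997))
```

The calls reproduce 67,539,99{7–5} (3 nt) and 67,5{40,000–39,997} (4 nt) from the der(5) examples.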
Nontemplated Sequence and CNVs
Nucleotide(s) that do not align to a reference chromosome are defined as nontemplated sequence. Note that BLAT lists the most relevant results to identify the complete rearrangement sequence; therefore, a sequence that appears as nontemplated in the BLAT results of a rearrangement sequence might align to a reference chromosome when submitted separately (20 is the minimum number of nucleotides for BLAT use), or it might be a small repeat of sequence in the vicinity of the breakpoint.
In order to avoid an unwieldy description, we provide only BLA(S)T Outputs of single rearrangement sequences in the examples. Note that for nontemplated sequences, CNVs, and homologies, the same rules apply to both BLA(S)T Output and Next-Gen Cytogenetic Nomenclature.
Nontemplated Sequence Other Than Single-Nucleotide Change or Mutation
If the nontemplated nucleotide represents a SNP or a mutation, current guidelines for reporting such alterations (e.g., from the Human Genome Variation Society) are used. Otherwise, the unmapped nucleotide(s) should be reported after the break and reunion symbol (::) and should be followed by another break and reunion symbol (::) without any space. If the nontemplated sequence is longer than 50 nucleotides, then only its first three and last three nucleotides, separated by an ellipsis (…), may be provided in the nomenclature, followed by the total number of nucleotides in curly brackets ({ }). The complete nontemplated sequence needs to be included in the report for future reference.
Examples are as follows:
11q14.2(+)(87,665,449)::TTC::2q32.1(−)(186,039,460)
A nontemplated sequence of three nucleotides (TTC) is located between the breakpoints.
11q14.2(+)(87,665,449)::TTC…ACG{75}::2q32.1(−)(186,039,460)
A nontemplated sequence of 75 nucleotides (starting with TTC and ending with ACG) is located between the breakpoints.
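The abbreviation rule for nontemplated sequence can be sketched as follows. The function names are hypothetical, the aligned parts are passed as already formatted strings, and the 75-nucleotide sequence in the usage example is filler invented for the illustration (only its first and last three nucleotides and its length come from the example above).

```python
def nontemplated_notation(seq, limit=50):
    """Abbreviate a nontemplated sequence longer than `limit` nucleotides."""
    if len(seq) > limit:
        return f"{seq[:3]}…{seq[-3:]}{{{len(seq)}}}"
    return seq


def junction_with_insert(first_part, seq, second_part):
    """Place the nontemplated sequence between two break-and-reunion symbols."""
    return f"{first_part}::{nontemplated_notation(seq)}::{second_part}"


first_part = "11q14.2(+)(87,665,449)"
second_part = "2q32.1(−)(186,039,460)"
filler = "TTC" + "A" * 69 + "ACG"  # hypothetical 75-nt sequence

print(junction_with_insert(first_part, "TTC", second_part))
print(junction_with_insert(first_part, filler, second_part))
```

The complete sequence is still expected to appear elsewhere in the report; the abbreviation affects only the nomenclature string.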
Nontemplated Sequence as a Repeat
At the discretion of the reporting laboratory, it may be determined whether the nontemplated sequence is a repeat of sequence in the vicinity of the breakpoint. A practical way to detect repeats is to align the nontemplated sequence (query) to the rearrangement sequence (subject) with BLAST, especially if the nontemplated sequence is fewer than 20 nucleotides (see examples under “CNVs” below).
CNVs
When CNVs are reported, the range should be written in parentheses in sequential order (not in numerical order) and separated by a dash (–). If the segment is of sufficient length to change the band assignment, the start and end chromosome bands are included without a dash. If the repeats are identical and consecutive in the sequence, the number of repeats should be indicated after a multiplication sign (×) following the parenthesis for the nucleotide designation without a space (as in ISCN 2013 for microarray nomenclature).
Examples are as follows:
This is the alignment result of a rearrangement sequence on der(5) of a t(1;5)(p22.3;q14.3) (Figure 6A; Figure S49 [DGAP131]21):
BLA(S)T Output: 5q14.3(+)(88,829,562)::GTCTCCAGGA::1p22.3(−)(86,157,132)
or
BLA(S)T Output: 5q14.3(+)(88,829,562)::5q14.3(88,829,550–88,829,558)::A::1p22.3(−)(86,157,132)
Nucleotides 39–47 of the rearrangement (query) sequence are the repeat of nucleotides 26–34 of the rearrangement (query) sequence and nucleotides 88,829,550–88,829,558 of the chromosome 5 reference sequence. Nucleotide 48 of the rearrangement (query) sequence is not part of the repeat; therefore, it is written as “A.”
This is the alignment result of a rearrangement sequence from an inv(8)(q11.21q24.33) (Figure 6B; DGAP247, Z.O., J.F.G., M.T., and C.C.M., unpublished data):
BLA(S)T Output: 8q11.21(−)(51,889,502)::TATTCTTTATTCTTTATTCT::8q24.23(+)(136,495,815)
or
BLA(S)T Output: 8q11.21(−)(51,889,502)::8q24.23(136,495,816–136,495,822)×4::8q24.23(+)(136,495,823)
Nucleotides 89–109 of the rearrangement (query) sequence are three repeats of nucleotides 110–116 of the rearrangement (query) sequence and nucleotides 136,495,816–136,495,822 of the chromosome 8 reference sequence. Note that nucleotide 109 (T) of the query is interpreted to be included in the nontemplated repeat region for the least complicated explanation of the repeat at the break and reunion. Therefore, the BLA(S)T Output of the nucleotide of the last part of the query is revised from nucleotide 136,495,815 to 136,495,816 of the reference sequence when the repeat is reported.
When the nontemplated sequence (TATTCTTTATTCTTTATTCT) is aligned with BLAT, it results in an alignment on chromosome 7. Considering that there is no indication of involvement of chromosome 7 on the basis of either G-banded karyotype or sequencing data, this scenario is interpreted to be incorrect.
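The repeat interpretation in the inv(8) example can be checked with a short Python sketch. The function name is hypothetical, and the reading of “×4” as three nontemplated copies plus the one copy that aligns to the reference is our interpretation of the example, not a statement taken from BOSToN.

```python
from typing import Tuple


def smallest_tandem_unit(seq: str) -> Tuple[str, int]:
    """Return (unit, copies) for the shortest unit whose repetition gives seq."""
    n = len(seq)
    for size in range(1, n + 1):
        if n % size == 0 and seq[:size] * (n // size) == seq:
            return seq[:size], n // size
    raise ValueError("empty sequence")


nontemplated = "TATTCTTTATTCTTTATTCT"  # 20 nucleotides, as first reported

# On its own, the 20-nucleotide sequence is not a whole number of repeats.
print(smallest_tandem_unit(nontemplated))

# Including the next query nucleotide (T, nucleotide 109) gives 21 nucleotides,
# which resolve into three copies of a 7-nucleotide unit.
unit, copies = smallest_tandem_unit(nontemplated + "T")
print(unit, copies)

# Assumed reading: three nontemplated copies plus the templated copy = 4.
print(f"×{copies + 1}")
```

This is why the breakpoint of the last aligned part shifts by one nucleotide when the repeat is reported: assigning the T to the repeat region gives the simplest description.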
The following is a hypothetical case with a CNV:
seq[GRCh37/hg19] dup(8)(pter→q11.21(51,889,502)::q11.21q24.23(51,889,503–136,495,822)×2::q24.23(136,495,823)→qter)
This case has a duplication of the indicated region. Note that in contrast to the array or in situ hybridization nomenclature, the nomenclature here does not include the copy on the unaltered chromosome 8 in this multiplication sign.
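The CNV notation used in the examples of this section can be assembled mechanically. The following Python sketch uses hypothetical helper names and covers only the range and multiplication-sign parts of the string, under the rules stated above (range in sequence order, en dash separator, no space before ×).

```python
def format_range(start: int, end: int) -> str:
    """Nucleotide range kept in sequence order, comma-grouped, en dash joined."""
    return f"({start:,}–{end:,})"


def format_cnv(bands: str, start: int, end: int, copies: int = 1) -> str:
    """Band designation, nucleotide range, and optional ×n without spaces."""
    token = f"{bands}{format_range(start, end)}"
    if copies > 1:
        token += f"×{copies}"
    return token


# Repeat within one band, reported four times (inv(8) example).
print(format_cnv("8q24.23", 136495816, 136495822, copies=4))

# Segment long enough to span two bands (hypothetical dup(8) example).
print(format_cnv("q11.21q24.23", 51889503, 136495822, copies=2))

# Single copy: no multiplication sign (der(5) example).
print(format_cnv("5q14.3", 88829550, 88829558))
```

Because the range is never sorted, an inverted segment keeps its larger coordinate first, which is what “sequential order (not in numerical order)” requires.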
Nontemplated Sequence as a Chromosome Segment Involved in a Rearrangement
If a direct repeat is not identified, another consideration is whether the nontemplated sequence is a chromosome segment (e.g., an inverted end or inverted repeat) that might be involved in a rearrangement. Additional cytogenetic observations (e.g., G-banded karyotype or metaphase ish) might be helpful for the interpretation of such results.
The following is an example of a rearrangement sequence on a der(8) of a complex case with more than two rearrangement sequences, including a translocation between the long arms of chromosomes 3 and 8 (Figure 7; Z.O., J.F.G., M.T., and C.C.M., unpublished data):
BLA(S)T Output: 3q25.32(+)(158,573,186)::ACCATGTTTGTAATTTCATTGC::8q24.21(−)(128,564,075)
or
BLA(S)T Output: 3q25.32(+)(158,573,186)::8q24.21(128,564,042–128,564,063)::8q24.21(−)(128,564,075)
The nontemplated sequence is an inverted repeat of the designated nucleotides. This inverted repeat is revealed through BLAT alignment of the nontemplated sequence (because the query size is larger than 20 nucleotides). This result is also confirmed through BLAST alignment of the nontemplated sequence (ACCATGTTTGTAATTTCATTGC) as the “query” and of the complete rearrangement sequence as the “subject” (Figure 7B).
Note that in a simple translocation between the long arms of two chromosomes, both of the aligned sequences are expected to have the same orientation (Figure 4C). However, this case is a complex case with more than two rearrangement sequences; therefore, the orientations of the two aligned sequences are in opposite directions.
Repetitive Elements
If the rearrangement sequence aligns to a repetitive element with multiple identical matches at different locations, in addition to a comment in the narrative report, two approaches may be followed at the discretion of the laboratory:
1. When supported by the other breakpoints and/or cytogenetic methods, the reference chromosome and nucleotide numbers of the repetitive element may be chosen on the basis of the alignment that is on the relevant chromosome location (band and/or nucleotide level).
2. If the alignment is ambiguous and one region is not more relevant to the rearrangement than the others, the repetitive region may be reported as a nontemplated sequence (see “Nontemplated Sequence Other Than Single-Nucleotide Change or Mutation”).
Uncertain Breakpoint Localization
An approximation sign (∼) can be used for rearrangements estimated to be within a nucleotide range, but not able to be delineated to the single-nucleotide level. Nucleotide ranges are listed in the order of the sequence.
Examples are as follows:
seq[GRCh37/hg19] der(3)(3pter→3q25.32(158,573,186)::8q24.21(128,534,000∼128,546,000)→8qter)
An unbalanced translocation between the long arms of chromosomes 3 and 8 results in a der(3) with an estimated nucleotide range for the breakpoint on chromosome 8.
seq[GRCh37/hg19] der(3)(8qter→8q24.21(128,546,000∼128,534,000)::3q25.32(158,573,186)→3qter)
An unbalanced translocation between the short arm of chromosome 3 and the long arm of chromosome 8 results in a der(3) with an estimated nucleotide range for the breakpoint on chromosome 8.
seq[GRCh37/hg19] der(3)(3pter→3q25.32(158,573,186∼)::8q24.21(128,534,288∼)→8qter)
An unbalanced translocation between the long arms of chromosomes 3 and 8 results in a der(3) with an estimated nucleotide number for both of the breakpoints because the rearrangement sequence is not able to be confirmed with capillary sequencing.
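The three uncertain-breakpoint forms above differ only in how the approximation sign is placed. A minimal Python sketch, with a hypothetical function name, is:

```python
from typing import Optional


def format_uncertain(first: int, second: Optional[int] = None) -> str:
    """Nucleotide designation with the approximation sign (∼).

    With two values, the range is kept in the order of the sequence.
    With one value, the sign marks a single estimated nucleotide number.
    """
    if second is None:
        return f"({first:,}∼)"
    return f"({first:,}∼{second:,})"


print(format_uncertain(128534000, 128546000))  # range read toward qter
print(format_uncertain(128546000, 128534000))  # same range read from qter
print(format_uncertain(158573186))             # single estimated nucleotide
```

The caller decides the order of the two values; the function deliberately does not sort them.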
Complex Cases
The symbol “cx” (similar to ISCN 2013) can be used for three or more complex rearrangements across at least two chromosomes. Note that there is no space when a symbol or abbreviation precedes or follows a parenthesis; therefore, the parenthesis for the involved chromosomes immediately follows the bracket, for example,
seq[GRCh37/hg19](2,3,8)cx
These complex rearrangements involve chromosomes 2, 3, and 8.
Chromothripsis
The symbol “cth” can be used for chromothripsis referring to complex events along a chromosome or chromosome segment (similar to ISCN 2013 for complex array results) and is recommended for three or more rearrangements, for example,
seq[GRCh37/hg19] 8q13.3q21.11(72,332,704–76,296,139)cth
Sequencing reveals multiple events in chromosome 8 between bands q13.3 and q21.11 and nucleotides 72,332,704 and 76,296,139.
Constructing Derivative Chromosomes from pter to qter
If possible, in the detailed system, each of the derivative chromosomes should be described from pter to qter, as recommended in the general principles (see “General Principles of Next-Gen Cytogenetic Nomenclature”). This description would not only provide a cytogenetic point of view for these complex rearrangements but also assist in understanding possible position effects (i.e., which chromosome segments, genes, or regulatory elements are in proximity on the derivative chromosomes). The nomenclature begins with seq and the GRCh and hg numbers, followed by the rearrangement type (cx or cth) and then a comma with a description of each derivative chromosome (see Figure S33 [DGAP187]21 for detailed instructions).
Here is an example (Figure S33 [DGAP187]21):
Short system: 46,XX,t(6;13)(q21;q32)dn.seq[GRCh37/hg19](6,13)cx,der(6)t(6;13)(q14.3;q31.1)dn,der(13)t(6;13)inv(6)(q14.3q14.3)dn
Detailed system: 46,XX,t(6;13)(q21;q32)dn.seq[GRCh37/hg19](6,13)cx,der(6)(6pter→6q14.3(85,897,870)::A::13q31.1(80,659,609)→13qter)dn,der(13)(13pter→13q31.1(80,659,606)::6q14.3(85,900,543–86,488,29{1})::6q14.3(85,900,54{0}–85,897,899)::6q14.3(93,909,993)→6qter)dn
This is a complex case with four rearrangements (one in der(6) and three in der(13)). The der(6) is derived from a translocation between 6q14.3 and 13q31.1. The der(13) has the same translocation, in addition to an inversion at 6q14.3. There is a nontemplated nucleotide in the der(6) and a single-nucleotide homology in the der(13). Note that in addition to multiple base-pair-scale nucleotide imbalances, there is an approximately 7.4 Mb deletion between nucleotides 86,488,291 and 93,909,993 because the segment ending with 6q14.3 (86,488,29{1}) going toward the centromere is the closest to the segment starting with 6q14.3 (93,909,993) going toward qter on the basis of the reference chromosome 6 sequence.
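The size of the deletion quoted above follows directly from the two coordinates. The Python sketch below, with hypothetical helper names, strips the comma grouping and the curly brackets that mark homology and counts the reference nucleotides lying strictly between the two positions. It assumes both listed nucleotides are retained on the derivative chromosome; because the bracketed nucleotide is a homology, the exact count could differ by one.

```python
def parse_position(text: str) -> int:
    """Convert a coordinate such as '86,488,29{1}' to an integer."""
    return int(text.replace(",", "").replace("{", "").replace("}", ""))


def bases_between(left: str, right: str) -> int:
    """Reference nucleotides lying strictly between two retained positions."""
    a, b = sorted((parse_position(left), parse_position(right)))
    return b - a - 1


gap = bases_between("86,488,29{1}", "93,909,993")
print(f"{gap:,} bp (about {gap / 1e6:.1f} Mb)")
```

The result, 7,421,701 bp, agrees with the approximately 7.4 Mb deletion described in the text.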
Describing Rearrangements with BLA(S)T Outputs
If it is not possible to construct derivative chromosomes from pter to qter (e.g., because of too many complex events, missing breakpoints, etc.) or as an alternative, the BLA(S)T Outputs of the rearrangement sequences may be used to describe individual breakpoints and are separated by commas. BLA(S)T Outputs are listed starting with sex chromosomes, followed by autosomes; breakpoint nucleotides of the first aligned part of the query are provided in reference-chromosome sequential order. Nomenclature begins with seq and the GRCh and hg numbers, followed by the abbreviated rearrangement description (cx or cth). The inheritance symbol at the end of a string indicates that the whole string has the provided inheritance (similar to ISCN 2013 for microarray nomenclature).
Here is an example (Figure S33 [DGAP187]21):
Short system: 46,XX,t(6;13)(q21;q32)dn.seq[GRCh37/hg19](6,13)cx dn
Detailed system: 46,XX,t(6;13)(q21;q32)dn.seq[GRCh37/hg19](6,13)cx,6q14.3(+)(85,897,870)::A::13q31.1(+)(80,659,609),6q14.3(−)(85,897,899)::6q14.3(+)(93,909,993),6q14.3(+)(85,900,54{0})::6q14.3(−)(86,488,29{1}),6q14.3(−)(85,900,543)::13q31.1(−)(80,659,606)dn
This is the same complex case described in the previous example ([DGAP187]21). Instead of describing the derivative chromosomes, it lists all of the BLA(S)T Outputs (separated by commas) in the reference-chromosome sequential order of the first parts of the BLA(S)T Outputs. The BLA(S)T Output starting with 6q14.3(+)(85,897,870) is followed by the BLA(S)T Output starting with 6q14.3(−)(85,897,899), then 6q14.3(+)(85,900,540), and lastly 6q14.3(−)(85,900,543). For preserving reference-chromosome sequential order, the reverse complement of a BLA(S)T Output can be used (e.g., 6q14.3(−)(85,900,543)::13q31.1(−)(80,659,606) vs. 13q31.1(+)(80,659,606)::6q14.3(+)(85,900,543); see Figure S33 [DGAP187]21 for more details). The “dn” at the end of the string indicates that the whole string is de novo. Note that in the short system, the case is only described as “cx” because the derivative chromosomes are not described from pter to qter.
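The ordering rule and the reverse-complement equivalence described above can be sketched in Python. All names here are hypothetical and the parser handles only the simple `band(strand)(position)` parts shown in this example. Reverse-complementing any nontemplated bases between the parts is our assumption, since the text illustrates the operation only for an output without nontemplated sequence.

```python
import re
from typing import Tuple

MINUS = "−"  # U+2212, the minus sign used for strand in the nomenclature
PART = re.compile(r"^(\d+|X|Y)([pq][\d.]+)\(([+−])\)\(([\d,{}]+)\)$")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def chrom_rank(chrom: str) -> int:
    """Sex chromosomes first (X, then Y), followed by autosomes in number order."""
    return {"X": -2, "Y": -1}.get(chrom, int(chrom) if chrom.isdigit() else 99)


def first_part_key(output: str) -> Tuple[int, int]:
    """Sort key: chromosome and nucleotide of the first aligned part."""
    match = PART.match(output.split("::")[0])
    if match is None:
        raise ValueError(f"unrecognized first part in {output!r}")
    chrom, _band, _strand, pos = match.groups()
    return chrom_rank(chrom), int(re.sub(r"[^\d]", "", pos))


def reverse_complement_output(output: str) -> str:
    """Reverse the part order, flip each strand, reverse-complement bases."""
    flipped = []
    for part in reversed(output.split("::")):
        match = PART.match(part)
        if match:
            chrom, band, strand, pos = match.groups()
            strand = MINUS if strand == "+" else "+"
            flipped.append(f"{chrom}{band}({strand})({pos})")
        else:
            # Assumed handling of nontemplated nucleotides (e.g., "A" -> "T").
            flipped.append(part.translate(COMPLEMENT)[::-1])
    return "::".join(flipped)


outputs = [
    "6q14.3(−)(85,900,543)::13q31.1(−)(80,659,606)",
    "6q14.3(+)(85,900,54{0})::6q14.3(−)(86,488,29{1})",
    "6q14.3(+)(85,897,870)::A::13q31.1(+)(80,659,609)",
    "6q14.3(−)(85,897,899)::6q14.3(+)(93,909,993)",
]

ordered = sorted(outputs, key=first_part_key)
print(",".join(ordered))
print(reverse_complement_output(outputs[0]))
```

Sorting on the first aligned part reproduces the order used in the detailed system above, and the reverse complement of the last output gives the alternative form quoted in the text.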
In highly complex cases, it might be convenient to display results by using Next-Gen Cytogenetic Nomenclature in tabular format instead of a string (Table S1) (similar to ISCN 2013 for microarray nomenclature).
Further Examples and BOSToN
DGAP represents a large collection of individuals with abnormal phenotypes and constitutional balanced chromosome rearrangements. Next-Gen Cytogenetic Nomenclature of a selection of previously published sequenced DGAP cases and of additional hypothetical simple cases is provided in Table S2 and the Supplemental Data.
To facilitate implementation of this nomenclature, we are developing a program entitled BOSToN (BLA(S)T Output Sequence Tool of Nomenclature) in the spirit of previous landmark conferences in cytogenetics (e.g., Denver [1960], London [1963], Chicago [1966], and Paris [1971, 1975]). BOSToN is a web-based program that employs the Next-Gen Cytogenetic Nomenclature recommendations. The goal of BOSToN is to generate the BLA(S)T Outputs, result tables, and Next-Gen Cytogenetic Nomenclature from rearrangement sequences entered in FASTA format. All data are currently acquired from BLAST; therefore, the result tables presented in BOSToN mirror the BLAST display and provide the BLAST score, expectation value, identity, gap, chromosome number, direction, subject (reference chromosome sequence) start and end, query (submitted rearrangement sequence) start and end, and total numbers. An illustrative version of BOSToN, along with a list of DGAP cases and their sequences for demonstration of the tool, is embedded in a publicly available website (see Web Resources). For use in clinical or research laboratories, a variety of improvements in BOSToN are planned.
Discussion
It has become increasingly evident that delineating structural chromosome rearrangements is of clinical significance,20–31 and sequencing such aberrations has already entered clinical practice.20,31,36,37 The lack of a consensus system to describe these rearrangements at the molecular level could lead to miscommunication of clinical and research results. We suggest Next-Gen Cytogenetic Nomenclature, a system analogous with ISCN 2013, and are developing BOSToN as an online tool to facilitate its implementation.
Although the need for a nomenclature system is current, sequencing results of structural rearrangements have been available since the 1980s. The example provided in “Missing Whole-Genome Information” represents an illustration of applying Next-Gen Cytogenetic Nomenclature to results obtained with historical sequencing methods in a Burkitt lymphoma reported in 1984.14 Current next-generation sequencing technologies are able to provide genome-wide information for both unbalanced and balanced structural rearrangements. Consequently, the suggested nomenclature is designed to describe both rearrangement types. Although unbalanced rearrangements might be the result of a single structural-rearrangement event (e.g., deletion, duplication, addition, amplification, or a single derivative of a simple translocation), they might also result from gains or losses accompanying translocations or inversions or even more complex rearrangements. For the former, the imbalance might be interpreted through the symbolic description of the event (e.g., the description of a duplication case would start with “dup” and would therefore imply the imbalance). For the latter, the direction of the aligned sequences, along with the continuity of the breakpoint nucleotides of the rearrangements based on the reference chromosome sequence, needs to be assessed (as described in Figures S26 and S27).
With rapid advances in genomic technologies, the exponential increase in sequencing data for structural chromosome aberrations introduces new concepts in genetics, such as chromothripsis. Next-Gen Cytogenetic Nomenclature can also be applied to complex chromosomal rearrangements, including chromothripsis. Nomenclature for sequencing results of a cancer cell line reported in the first publication of chromothripsis29 (seq[NCBI36/hg18] 5p15.33q34(1,391,571–167,283,889)cth) is listed in Table S1.
Evolving knowledge generated from next-generation sequencing will most likely require ongoing modifications to the nomenclature described herein. One such issue is the human genomic variant regions that are described as “alternate loci (alt loci)” by the Genome Reference Consortium. The recently released GRCh38 has 261 alt loci scaffolds in 35 alternate assembly units. A structural rearrangement within the region of an alt locus or the presence of the alt locus in an individual might be of clinical importance in assessing the pathogenicity of a rearrangement. In such instances, a possible consideration is to add the extension of alt loci assembly units after the GRCh build number and separate them with an underscore (e.g., [GRCh37_ALT_REF_LOCI_1/hg19] represents the use of the first alternate sequence on chromosome 6 [major histocompatibility complex region] in GRCh37). Another issue is the combination of the nomenclature of sequence variants other than structural rearrangements (e.g., single-nucleotide substitutions) and the structural chromosome rearrangements, given that both can be obtained by next-generation sequencing.36 Another example of a future consideration is describing the results of structural aberrations at the RNA and protein levels, or in the mitochondrial genome.
The suggested nomenclature described herein is designed to provide an objective system to explain the structural rearrangements at a molecular level. Nonetheless, similar to microarray nomenclature described in ISCN 2013, it does not inform the clinical significance of these aberrations. Considering the establishment of chromosome microarrays in clinical practice,38–40 the pathogenic significance of the sequencing results of structural chromosome aberrations resulting in copy-number changes (e.g., deletions, duplications, etc.) can be interpreted akin to the evaluation of microarray results of such rearrangements with the use of the following categories: (1) pathogenic variants, (2) variants of uncertain clinical significance, and (3) benign variants. The clinical significance of balanced chromosome rearrangements (e.g., translocations and inversions) has not been extensively compared to that of CNVs or mutations. Therefore, currently available resources (e.g., publications, databases, etc.) for CNVs or mutations can be combined for the interpretation of the balanced structural rearrangements through evaluating the association of the disrupted regions and/or genes with an abnormal phenotype, as illustrated in previous DGAP publications.20–24 In addition, the data for individuals with or without an abnormal phenotype with apparently balanced chromosome rearrangements are incorporated into some existing databases (e.g., DGV41 and DECIPHER42), and new databases (e.g., InvFEST43) are emerging specifically for these rearrangements. 
On the basis of the evidence gathered from these resources, the clinical significance of balanced rearrangements can be assessed given the following scenarios (in increasing order of potential concern): disruption of nonpathogenic regions observed in individuals without a presumed relevant clinical phenotype, disruption of nongenic regions of uncertain clinical significance, disruption of a single gene or regulatory region of uncertain clinical significance, disruption of a single pathogenic region (recessive), disruption of a single pathogenic region (dominant), and so on. These interpretations are subject to change with additional studies, as observed in the clinical evaluation of microarray results.44 However, adding a summarized “interpretation table” including the relevant information (e.g., disruption of a pathogenic region, a list of involved genes or at least genes known to be pathogenic, evidence based on previously published cases, etc.) might be valuable for healthcare providers.
Next-Gen Cytogenetic Nomenclature provides a path for describing sequencing data of constitutional and acquired structural rearrangements obtained through historical and current sequencing techniques, thus facilitating the communication of sequencing results. The interpretation of structural rearrangements detected by sequencing and their clinical significance can be complex, warranting further development of computational algorithms to assist with this process.
Acknowledgments
We are grateful to all participating subjects and families and to the many healthcare professionals who have contributed to the Developmental Genome Anatomy Project. We especially thank Ye Cao, Richard Choy, Cary Scott Gallagher, Anne Giersch, Anita Hawkins, Linda Johnson, Nahid Robertson, and Jun Shen for their contributions during manuscript preparation. This work was supported by NIH grants (GM061354 to C.C.M. and J.F.G., HD060530 to C.C.M., HD065286 to J.F.G., and MH087123 and MH095867 to M.E.T.). B.B.C. is supported by F32 DC012466 and previously by T32 GM007748. M.E.T. and J.F.G. are supported by the Simons Foundation for Autism Research and the Nancy Lurie Marks Family Foundation.
Supplemental Data
Web Resources
The URLs for data presented herein are as follows:
BLA(S)T Output Sequence Tool of Nomenclature (BOSToN), http://boston.bwh.harvard.edu/
Database of Genomic Variants (DGV), http://dgv.tcag.ca/dgv/app/home
DECIPHER, http://decipher.sanger.ac.uk
Developmental Genome Anatomy Project (DGAP), http://www.bwhpathology.org/dgap/
GRCh38, http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/
Human Genome Variation Society, http://www.hgvs.org/mutnomen/
InvFEST: Human Polymorphic Inversion Database, http://invfestdb.uab.cat/
Replicating web-based Blat percent identity and score calculations, http://genome.ucsc.edu/FAQ/FAQblat.html#blat4
UCSC Genome Browser, http://genome.ucsc.edu
References
1. Tjio J.H., Levan A. The chromosome number of man. Hereditas. 1956;42:1–6.
2. A proposed standard system of nomenclature of human mitotic chromosomes (Denver, Colorado) Ann. Hum. Genet. 1960;24:319–325. doi: 10.1111/j.1469-1809.1960.tb01744.x.
3. Caspersson T., Farber S., Foley G.E., Kudynowski J., Modest E.J., Simonsson E., Wagh U., Zech L. Chemical differentiation along metaphase chromosomes. Exp. Cell Res. 1968;49:219–222. doi: 10.1016/0014-4827(68)90538-7.
4. Caspersson T., Hultén M., Lindsten J., Zech L. Identification of chromosome bivalents in human male meiosis by quinacrine mustard fluorescence analysis. Hereditas. 1972;67:147–149. doi: 10.1111/j.1601-5223.1971.tb02368.x.
5. Paris Conference (1971): Standardization in human cytogenetics. Cytogenetics. 1972;11:317–362.
6. An international system for human cytogenetic nomenclature (1978) ISCN (1978). Report of the Standing Commitee on Human Cytogenetic Nomenclature. Cytogenet. Cell Genet. 1978;21:309–409. doi: 10.1159/000130909.
7. Report of the Standing Committee on Human Cytogenetic Nomenclature An international system for human cytogenetic nomenclature—high-resolution banding (1981). ISCN (1981) Cytogenet. Cell Genet. 1981;31:5–23. doi: 10.1159/000131621.
8. Report of the Standing Committee on Human Cytogenetic Nomenclature An International System for Human Cytogenetic Nomenclature (1985) ISCN 1985. Birth Defects Orig. Artic. Ser. 1985;21:1–117.
9. Mitelman F., editor. ISCN 1991: Guidelines for Cancer Cytogenetics: Supplement to an International System for Human Cytogenetic Nomenclature: Recommendations of the Standing Committee on Human Cytogenetic Nomenclature. Karger; Basel: 1992.
10. Mitelman F., editor. ISCN 1995: An International System for Human Cytogenetic Nomenclature (1995): Recommendations of the international standing committee on human cytogenetic nomenclature. S. Karger; Basel: 1995.
11. Shaffer L.G., Tommerup N., editors. ISCN 2005: An International System for Human Cytogenetic Nomenclature (2005): Recommendations of the International Standing Committee on Human Cytogenetic Nomenclature. Karger; Basel: 2005.
12. Shaffer L.G., Slovak M.L., Campbell L.J., editors. ISCN 2009: An International System for Human Cytogenetic Nomenclature (2009) Karger; Basel: 2009.
13. Shaffer L.G., McGowan-Jordan J., Schmid M., editors. ISCN 2013: An International System for Human Cytogenetic Nomenclature (2013) Karger; Basel: 2013.
14. Taub R., Kelly K., Battey J., Latt S., Lenoir G.M., Tantravahi U., Tu Z., Leder P. A novel alteration in the structure of an activated c-myc gene in a variant t(2;8) Burkitt lymphoma. Cell. 1984;37:511–520. doi: 10.1016/0092-8674(84)90381-7.
15. Bernard O., Cory S., Gerondakis S., Webb E., Adams J.M. Sequence of the murine and human cellular myc oncogenes and two modes of myc transcription resulting from chromosome translocation in B lymphoid tumours. EMBO J. 1983;2:2375–2383. doi: 10.1002/j.1460-2075.1983.tb01749.x.
16. Bentley D.R., Balasubramanian S., Swerdlow H.P., Smith G.P., Milton J., Brown C.G., Hall K.P., Evers D.J., Barnes C.L., Bignell H.R. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456:53–59. doi: 10.1038/nature07517.
17. Korbel J.O., Urban A.E., Affourtit J.P., Godwin B., Grubert F., Simons J.F., Kim P.M., Palejev D., Carriero N.J., Du L. Paired-end mapping reveals extensive structural variation in the human genome. Science. 2007;318:420–426. doi: 10.1126/science.1149504.
18. Chen W., Kalscheuer V., Tzschach A., Menzel C., Ullmann R., Schulz M.H., Erdogan F., Li N., Kijas Z., Arkesteijn G. Mapping translocation breakpoints by next-generation sequencing. Genome Res. 2008;18:1143–1149. doi: 10.1101/gr.076166.108.
19. Talkowski M.E., Ernst C., Heilbut A., Chiang C., Hanscom C., Lindgren A., Kirby A., Liu S., Muddukrishna B., Ohsumi T.K. Next-generation sequencing strategies enable routine detection of balanced chromosome rearrangements for clinical diagnostics and genetic research. Am. J. Hum. Genet. 2011;88:469–481. doi: 10.1016/j.ajhg.2011.03.013.
20. Talkowski M.E., Ordulu Z., Pillalamarri V., Benson C.B., Blumenthal I., Connolly S., Hanscom C., Hussain N., Pereira S., Picker J. Clinical diagnosis by whole-genome sequencing of a prenatal sample. N. Engl. J. Med. 2012;367:2226–2232. doi: 10.1056/NEJMoa1208594.
21. Talkowski M.E., Rosenfeld J.A., Blumenthal I., Pillalamarri V., Chiang C., Heilbut A., Ernst C., Hanscom C., Rossin E., Lindgren A.M. Sequencing chromosomal abnormalities reveals neurodevelopmental loci that confer risk across diagnostic boundaries. Cell. 2012;149:525–537. doi: 10.1016/j.cell.2012.03.028.
22. Lindgren A.M., Hoyos T., Talkowski M.E., Hanscom C., Blumenthal I., Chiang C., Ernst C., Pereira S., Ordulu Z., Clericuzio C. Haploinsufficiency of KDM6A is associated with severe psychomotor retardation, global growth restriction, seizures and cleft palate. Hum. Genet. 2013;132:537–552. doi: 10.1007/s00439-013-1263-x.
23. Chiang C., Jacobsen J.C., Ernst C., Hanscom C., Heilbut A., Blumenthal I., Mills R.E., Kirby A., Lindgren A.M., Rudiger S.R. Complex reorganization and predominant non-homologous repair following chromosomal breakage in karyotypically balanced germline rearrangements and transgenic integration. Nat. Genet. 2012;44:390–397. doi: 10.1038/ng.2202. S1.
24. Talkowski M.E., Maussion G., Crapper L., Rosenfeld J.A., Blumenthal I., Hanscom C., Chiang C., Lindgren A., Pereira S., Ruderfer D. Disruption of a large intergenic noncoding RNA in subjects with neurodevelopmental disabilities. Am. J. Hum. Genet. 2012;91:1128–1134. doi: 10.1016/j.ajhg.2012.10.016.
25. Vergult S., Van Binsbergen E., Sante T., Nowak S., Vanakker O., Claes K., Poppe B., Van der Aa N., van Roosmalen M.J., Duran K. Mate pair sequencing for the detection of chromosomal aberrations in patients with intellectual disability and congenital malformations. Eur. J. Hum. Genet. 2013 doi: 10.1038/ejhg.2013.220.
26. Stephens P.J., McBride D.J., Lin M.L., Varela I., Pleasance E.D., Simpson J.T., Stebbings L.A., Leroy C., Edkins S., Mudie L.J. Complex landscapes of somatic rearrangement in human breast cancer genomes. Nature. 2009;462:1005–1010. doi: 10.1038/nature08645.
27. Campbell P.J., Yachida S., Mudie L.J., Stephens P.J., Pleasance E.D., Stebbings L.A., Morsberger L.A., Latimer C., McLaren S., Lin M.L. The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature. 2010;467:1109–1113. doi: 10.1038/nature09460.
28. Schluth-Bolard C., Labalme A., Cordier M.P., Till M., Nadeau G., Tevissen H., Lesca G., Boutry-Kryza N., Rossignol S., Rocas D. Breakpoint mapping by next generation sequencing reveals causative gene disruption in patients carrying apparently balanced chromosome rearrangements with intellectual deficiency and/or congenital malformations. J. Med. Genet. 2013;50:144–150. doi: 10.1136/jmedgenet-2012-101351.
29. Stephens P.J., Greenman C.D., Fu B., Yang F., Bignell G.R., Mudie L.J., Pleasance E.D., Lau K.W., Beare D., Stebbings L.A. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell. 2011;144:27–40. doi: 10.1016/j.cell.2010.11.055.
30. Hanahan D., Weinberg R.A. Hallmarks of cancer: the next generation. Cell. 2011;144:646–674. doi: 10.1016/j.cell.2011.02.013.
31. Kamalakaran S., Varadan V., Janevski A., Banerjee N., Tuck D., McCombie W.R., Dimitrova N., Harris L.N. Translating next generation sequencing to practice: opportunities and necessary steps. Mol. Oncol. 2013;7:743–755. doi: 10.1016/j.molonc.2013.04.008.
32. Meyer L.R., Zweig A.S., Hinrichs A.S., Karolchik D., Kuhn R.M., Wong M., Sloan C.A., Rosenbloom K.R., Roe G., Rhead B. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 2013;41(Database issue):D64–D69. doi: 10.1093/nar/gks1048.
33. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
34. Kent W.J. BLAT—the BLAST-like alignment tool. Genome Res. 2002;12:656–664. doi: 10.1101/gr.229202.
35. Kent W.J., Sugnet C.W., Furey T.S., Roskin K.M., Pringle T.H., Zahler A.M., Haussler D. The human genome browser at UCSC. Genome Res. 2002;12:996–1006. doi: 10.1101/gr.229102.
36. Rehm H.L., Bale S.J., Bayrak-Toydemir P., Berg J.S., Brown K.K., Deignan J.L., Friez M.J., Funke B.H., Hegde M.R., Lyon E., Working Group of the American College of Medical Genetics and Genomics Laboratory Quality Assurance Commitee ACMG clinical laboratory standards for next-generation sequencing. Genet. Med. 2013;15:733–747. doi: 10.1038/gim.2013.92.
37. Xuan J., Yu Y., Qing T., Guo L., Shi L. Next-generation sequencing in the clinic: promises and challenges. Cancer Lett. 2013;340:284–295. doi: 10.1016/j.canlet.2012.11.025.
38. Kearney H.M., South S.T., Wolff D.J., Lamb A., Hamosh A., Rao K.W., Working Group of the American College of Medical Genetics American College of Medical Genetics recommendations for the design and performance expectations for clinical genomic copy number microarrays intended for use in the postnatal setting for detection of constitutional abnormalities. Genet. Med. 2011;13:676–679. doi: 10.1097/GIM.0b013e31822272ac.
39. Cooley L.D., Lebo M., Li M.M., Slovak M.L., Wolff D.J., Working Group of the American College of Medical Genetics and Genomics (ACMG) Laboratory Quality Assurance Committee American College of Medical Genetics and Genomics technical standards and guidelines: microarray analysis for chromosome abnormalities in neoplastic disorders. Genet. Med. 2013;15:484–494. doi: 10.1038/gim.2013.49.
40. South S.T., Lee C., Lamb A.N., Higgins A.W., Kearney H.M., Working Group for the American College of Medical Genetics and Genomics Laboratory Quality Assurance Committee ACMG Standards and Guidelines for constitutional cytogenomic microarray analysis, including postnatal and prenatal applications: revision 2013. Genet. Med. 2013;15:901–909. doi: 10.1038/gim.2013.129.
41. MacDonald J.R., Ziman R., Yuen R.K., Feuk L., Scherer S.W. The Database of Genomic Variants: a curated collection of structural variation in the human genome. Nucleic Acids Res. 2014;42(Database issue):D986–D992. doi: 10.1093/nar/gkt958.
42. Firth H.V., Richards S.M., Bevan A.P., Clayton S., Corpas M., Rajan D., Van Vooren S., Moreau Y., Pettett R.M., Carter N.P. Decipher: Database of chromosomal imbalance and phenotype in humans using ensembl resources. Am. J. Hum. Genet. 2009;84:524–533. doi: 10.1016/j.ajhg.2009.03.010.
43. Martínez-Fundichely A., Casillas S., Egea R., Ràmia M., Barbadilla A., Pantano L., Puig M., Cáceres M. InvFEST, a database integrating information of polymorphic inversions in the human genome. Nucleic Acids Res. 2014;42(Database issue):D1027–D1032. doi: 10.1093/nar/gkt1122.
44. Palmer E., Speirs H., Taylor P.J., Mullan G., Turner G., Einfeld S., Tonge B., Mowat D. Changing interpretation of chromosomal microarray over time in a community cohort with intellectual disability. Am. J. Med. Genet. A. 2014;164A:377–385. doi: 10.1002/ajmg.a.36279.