. 2019 Nov 20;20:246. doi: 10.1186/s13059-019-1828-7

Table 2.

Glossary. Here, positive (P) or negative (N) describes the SV detection (or SV calling), and true (T) or false (F) describes whether the call was correct. Thus, SVs that are present in the sample are true positives (TP) if they are called or false negatives (FN) if they are not called. Conversely, SVs that are not in the sample are true negatives (TN) if they are not called or false positives (FP) if they are called.

Word	Definition
Accuracy	Proportion of correctly identified events (T) to the overall events: (TP + TN)/(TP + TN + FP + FN).
Breakpoints	Positions on the genome denoting the start and end of SVs relative to the reference genome.
Contigs	Contiguous sequence stretches assembled from reads.
De Bruijn graph	Directed graph consisting of nodes with exactly n incoming and n outgoing edges. In genome assemblies, a de Bruijn graph is built where the nodes are k-mers (sequences of length k) and the edges correspond to an overlap of k − 1 bases between nodes.
String graph-based assembly	Method similar to de Bruijn graph-based assembly, but in this case, the overlaps between all pairs of reads (instead of k-mers) are computed, and a string graph is constructed from these overlaps.
Insert size	The distance between the two reads of a paired-end read pair.
Overhang	Portion of a mapped read that cannot be aligned and thus could indicate a structural variation.
Phasing	The identification of whether two or more heterozygous variations co-occur on the same DNA molecule or on different DNA molecules.
Precision (or positive predictive value)	Proportion of predictions (FP + TP) that are correct (TP).
Recall (or sensitivity or true-positive rate)	Proportion of the total positives (FN + TP) that were correctly identified (TP).
Scaffold	Connected contiguous sequence stretches, with unresolved sequence stretches in between.
Split reads	Reads containing parts that map to different loci on the reference genome. They are found by splitting the read into sub-segments, aligning each sub-segment individually, and then grouping the sub-segments that come from one read.
Tandem sequence	A specific type of repetitive region in which the copies of the repeated unit are directly adjacent to each other.
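
The accuracy, precision, and recall definitions in the glossary, and the k-mer construction described for the de Bruijn graph, can be illustrated with a minimal Python sketch. The function names, the example counts, and the example read are hypothetical and chosen only for illustration; they are not taken from the article or from any SV caller, and the graph builder ignores practical concerns such as sequencing errors and reverse complements.

```python
from collections import defaultdict


def accuracy(tp, tn, fp, fn):
    """Correct events over all events: (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Share of all predictions (FP + TP) that are correct (TP)."""
    return tp / (fp + tp)


def recall(tp, fn):
    """Share of all positives (FN + TP) that were identified (TP)."""
    return tp / (fn + tp)


def de_bruijn_edges(reads, k):
    """Map each k-mer to the k-mers that follow it in some read.

    Two consecutive k-mers in a read share k - 1 bases, which is the
    overlap that defines an edge in the glossary entry.
    """
    edges = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            edges[read[i:i + k]].add(read[i + 1:i + k + 1])
    return edges


# Hypothetical counts for one call set (assumed values, not real data).
tp, tn, fp, fn = 80, 900, 20, 10
print("accuracy:", accuracy(tp, tn, fp, fn))
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))

# Hypothetical read; with k = 3 the k-mers are ACG, CGT, GTA.
graph = de_bruijn_edges(["ACGTA"], k=3)
for node, successors in graph.items():
    print(node, "->", sorted(successors))
```

In this sketch, precision and recall do not use TN, which is convenient in SV benchmarking because the number of true negatives is often not well defined.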