Table 1.
DNA feature | Search criteria | Subset of ‘DNA feature’ forming non-B DNA | Search criteria for ‘Subset of DNA feature’ |
---|---|---|---|
Inverted repeat | Repeat: 10–100 nt; Spacer: 0–100 nt | Cruciform motif | Repeat: 10–100 nt; Spacer: 0–3 nt |
Mirror repeat | Repeat: 10–100 nt; Spacer: 0–100 nt | Triplex motif | Repeat: 10–100 R or Y nt; Spacer: 0–8 nt |
Direct repeat | Repeat: 10–50 nt; Spacer: 0–5 nt | Slipped motif | Repeat: 10–50 nt; Spacer: 0 nt |
Z-DNA repeat | ≥5 units of CG/TG or CG/CA repeats | Whole set | As per the whole set |
G-quadruplex forming repeat | Four identical blocks of (3–7) G nt, each block separated by 1–7 nt | Whole set | As per the whole set |
A-phased repeat | ≥3 runs of A-tracts with 10-bp phasing | Whole set | As per the whole set |
Inverted repeat: a pair of DNA sequences, each 10–100 nt in length and separated by a spacer of 0–100 nt, whose sequence composition on the same strand of DNA is such that the bases of the first repeat, when read in the 5′→3′ orientation, are complementary to those of the second repeat read in the 3′→5′ orientation. The term ‘complementary’ refers to the Watson–Crick hydrogen bonding scheme, whereby A only pairs with T and C only pairs with G. Only perfect inverted repeats that conform to this Watson–Crick pairing scheme are considered.
Cruciform motif: the subset of inverted repeat sequences in which the ‘Spacer’ comprises 0–3 bases; owing to the proximity of the two repeats, this subset may fold back and form intramolecular, antiparallel double helices stabilized by Watson–Crick hydrogen bonds, i.e. a cruciform structure (1,34).
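The inverted repeat and cruciform criteria above can be illustrated with a minimal Python sketch. It is a naive scan written for clarity, not the algorithm used to build the table; the function names and the choice to report every qualifying arm length (rather than only maximal repeats) are assumptions made for illustration.

```python
# Complement table for uppercase DNA (A pairs with T, C pairs with G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, min_arm=10, max_arm=100, max_spacer=100):
    """Return (start, arm_length, spacer_length) for perfect inverted repeats.

    Hypothetical helper: brute-force search, no mismatches allowed.
    """
    n = len(seq)
    hits = []
    for start in range(n):
        for arm in range(min_arm, max_arm + 1):
            if start + 2 * arm > n:
                break  # two arms of this length no longer fit
            target = reverse_complement(seq[start:start + arm])
            for spacer in range(max_spacer + 1):
                second = start + arm + spacer
                if second + arm > n:
                    break
                if seq[second:second + arm] == target:
                    hits.append((start, arm, spacer))
    return hits


def is_cruciform_candidate(spacer_length, max_cruciform_spacer=3):
    """Cruciform subset: inverted repeats whose spacer is 0-3 nt."""
    return 0 <= spacer_length <= max_cruciform_spacer
```

For example, a 10-nt arm followed by a 2-nt spacer and the arm's reverse complement is reported as an inverted repeat and, because the spacer is at most 3 nt, also qualifies as a cruciform candidate.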
Mirror repeat: a pair of DNA sequences, each 10–100 nt in length and separated by a spacer of 0–100 nt, whose sequence composition on the same strand of DNA is such that the bases of the first repeat, when read in the 5′→3′ orientation, are identical to those of the second repeat read in the 3′→5′ orientation (palindrome); only perfectly matching repeats are included.
Triplex motif: the subset of mirror repeat sequences comprising only purines (R = A and G) or only pyrimidines (Y = C and T) on the same strand of DNA, and which are separated by a few (0–8) nt (‘Spacer’). These motifs are able to form various intramolecular three-stranded (triplex, H-DNA) isoforms stabilized by Hoogsteen hydrogen bonds (1,52,53). Only R•Y-containing mirror repeats that may yield A:A•T and G:G•C base triplets (colon indicates Hoogsteen hydrogen-bonded bases; dot indicates Watson–Crick hydrogen-bonded bases) for the R:R•Y type of intramolecular triplexes, and T:A•T and C+:G•C triplets for the Y:R•Y type of intramolecular triplexes, are included, since these are considered the most stable triplet combinations.
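As a hedged illustration of the mirror repeat and triplex criteria, the following Python sketch checks a single pair of repeats. The helper names are hypothetical, and the sketch assumes that only the two repeat arms (not the spacer) must be all-purine or all-pyrimidine, which the definition above does not state explicitly.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_mirror_pair(first: str, second: str) -> bool:
    """True if the second repeat is the first read backwards (perfect mirror)."""
    return len(first) == len(second) and second == first[::-1]


def is_triplex_candidate(first, second, spacer_length,
                         min_arm=10, max_arm=100, max_spacer=8):
    """Triplex subset: R-only or Y-only mirror repeats with a 0-8 nt spacer."""
    if not is_mirror_pair(first, second):
        return False
    if not (min_arm <= len(first) <= max_arm):
        return False
    if not (0 <= spacer_length <= max_spacer):
        return False
    bases = set(first)
    # Assumption: composition is tested on the repeat arms only.
    return bases <= PURINES or bases <= PYRIMIDINES
```

A mirror repeat that mixes purines and pyrimidines still satisfies `is_mirror_pair` but is rejected by `is_triplex_candidate`.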
Direct repeat: two tracts of DNA, each comprising 10–50 nt and separated by 0–5 nt, having the same sequence composition.
Slipped motif: the subset of direct repeat sequences without a spacer (tandem repeats); when aligned in an out-of-register fashion, tandem repeats may give rise to single-stranded loops and/or hairpins (1).
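The direct repeat and slipped motif criteria can be sketched with a regular expression that uses a backreference. This is an illustrative approximation, not the search procedure behind the table: the pattern and function names are assumptions, and at each start position it reports only the first repeat found by the regex engine (longest unit, then shortest spacer).

```python
import re

# Lookahead so that overlapping repeats starting at different positions are found.
# Group 1: repeat unit (10-50 nt); group 2: spacer (0-5 nt); \1: exact second copy.
DIRECT_REPEAT = re.compile(r"(?=([ACGT]{10,50})([ACGT]{0,5}?)\1)")


def find_direct_repeats(seq: str):
    """Return (start, unit, spacer) for perfect direct repeats in uppercase DNA."""
    return [(m.start(), m.group(1), m.group(2))
            for m in DIRECT_REPEAT.finditer(seq)]


def is_slipped_candidate(spacer: str) -> bool:
    """Slipped subset: direct repeats with no spacer (tandem repeats)."""
    return spacer == ""
```

Applied to a 10-nt unit repeated twice in tandem, the sketch returns a single hit with an empty spacer, which qualifies as a slipped motif candidate; inserting 2 nt between the copies yields a direct repeat that does not.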
Z-DNA repeat: five or more tandem repeats, each comprising an alternating pyrimidine–purine dinucleotide motif, in which the pattern YG is maintained on at least one of the DNA strands; examples include (CG•CG)6, (CA•TG)5 and [(TG)3(CG)4•(CG)4(CA)3]; these motifs may adopt the left-handed Z-DNA conformation (3,54).
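A minimal Python sketch of this criterion follows, assuming that "YG on at least one strand" can be tested on a single strand by accepting runs of CG/TG units (YG on the scanned strand) or CG/CA units (YG on the complementary strand). The pattern and function names are hypothetical.

```python
import re

# Five or more tandem dinucleotide units.
YG_ON_SCANNED_STRAND = re.compile(r"(?:CG|TG){5,}")
# CA on the scanned strand pairs with TG on the complementary strand.
YG_ON_COMPLEMENT = re.compile(r"(?:CG|CA){5,}")


def find_zdna_candidates(seq: str):
    """Return sorted (start, end) spans of candidate Z-DNA repeats (end exclusive)."""
    hits = set()
    for pattern in (YG_ON_SCANNED_STRAND, YG_ON_COMPLEMENT):
        for m in pattern.finditer(seq):
            hits.add((m.start(), m.end()))
    return sorted(hits)
```

Each of the three examples listed above, written as its top-strand sequence, is reported as one span, whereas a run of only four CG units is not.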
G-quadruplex-forming repeat: four blocks, each containing the same number (n) of G bases (n can vary from 3 to 7), on the plus or minus strand, separated by 1–7 nt; this type of DNA sequence may adopt quadruplex structures (2); overlapping tracts of four G-blocks are also considered.
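The G-quadruplex criterion lends itself to a backreference-based regular expression, sketched below in Python. Several choices are assumptions rather than part of the definition: loops may contain any base, loops are matched lazily (so the shortest qualifying tract is reported per start position), and minus-strand motifs are detected as C-blocks on the scanned strand.

```python
import re

# Group 1: the whole motif; group 2: the first block, reused via \2 so that
# all four blocks contain the same number of bases. The lookahead permits
# overlapping hits starting at different positions.
G4_PLUS = re.compile(
    r"(?=((G{3,7})[ACGT]{1,7}?\2[ACGT]{1,7}?\2[ACGT]{1,7}?\2))")
G4_MINUS = re.compile(
    r"(?=((C{3,7})[ACGT]{1,7}?\2[ACGT]{1,7}?\2[ACGT]{1,7}?\2))")


def find_g4_candidates(seq: str):
    """Return sorted (start, end, strand) for candidate G-quadruplex repeats."""
    hits = []
    for strand, pattern in (("+", G4_PLUS), ("-", G4_MINUS)):
        for m in pattern.finditer(seq):
            hits.append((m.start(), m.start() + len(m.group(1)), strand))
    return sorted(hits)
```

For the sequence GGGTTAGGGTTAGGGTTAGGG the sketch reports one plus-strand candidate spanning all 21 nt; a sequence with only three G-blocks yields no hit.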
A-phased repeat: three or more runs of A bases (A-tracts) in phase with the helical pitch of the DNA double helix, i.e. 10 bp; an A-tract is defined as a set of A•T base pairs without a TpA step (47,55–57); three or more tracts of A3–7, T3–7, AAATTT, AAATTTT and AAAATTT (in any combination) on the plus or minus strand, whose centers are separated by 10 bases, are considered; since A-tracts induce static bends in the DNA double helix, the overall DNA superhelix is expected to display either a left-handed or a right-handed writhe (47,55–57). As mentioned, none of the search criteria used herein allows for interruptions in the repeats, and no thermodynamic information was factored into the algorithms used.