2010 Nov 20;39(Database issue):D383–D391. doi: 10.1093/nar/gkq1170

Table 1.

Criteria for predicting non-B DNA-forming motifs in non-B DB

| DNA feature | Search criteria | Subset of ‘DNA feature’ forming non-B DNA | Search criteria for ‘Subset of DNA feature’ |
| Inverted repeat | Repeat: 10–100 nt; Spacer: 0–100 nt | Cruciform motif | Repeat: 10–100 nt; Spacer: 0–3 nt |
| Mirror repeat | Repeat: 10–100 nt; Spacer: 0–100 nt | Triplex motif | Repeat: 10–100 R or Y nt; Spacer: 0–8 nt |
| Direct repeat | Repeat: 10–50 nt; Spacer: 0–5 nt | Slipped motif | Repeat: 10–50 nt; Spacer: 0 nt |
| Z-DNA repeat | ≥5 units of CG/TG or CG/CA repeats | Whole set | As per the whole set |
| G-quadruplex forming repeat | Four identical blocks of (3–7) G nt, each block separated by 1–7 nt | Whole set | As per the whole set |
| A-phased repeat | ≥3 runs of A-tracts with 10-bp phasing | Whole set | As per the whole set |

Inverted repeat: a pair of DNA sequences, each 10–100 nt in length and separated by a spacer of 0–100 nt, whose sequence composition on the same strand of DNA is such that the bases of the first repeat, when read in the 5′→3′ orientation, are complementary to those of the second repeat read in the 3′→5′ orientation. The term ‘complementary’ refers to the Watson–Crick hydrogen bonding scheme, whereby A only pairs with T and C only pairs with G. Only perfect inverted repeats that conform to this Watson–Crick pairing scheme are considered.

Cruciform motif: the subset of inverted repeat sequences in which the ‘Spacer’ comprises 0–3 bases; because the two repeats lie so close together, this subset of inverted repeat sequences may fold back and form intramolecular, antiparallel double helices stabilized by Watson–Crick hydrogen bonds, i.e. a cruciform structure (1,34).
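The inverted repeat and cruciform criteria can be illustrated with a minimal Python sketch. It is a brute-force search written for clarity, not the algorithm used by non-B DB; the function names `reverse_complement`, `find_inverted_repeats` and `is_cruciform`, and the `(start, arm_length, spacer_length)` hit format, are assumptions made for this example.

```python
# Illustrative sketch only: a brute-force search for perfect inverted repeats.
# All names and the hit format are hypothetical, not taken from non-B DB.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, min_arm=10, max_arm=100, max_spacer=100):
    """Return (start, arm_length, spacer_length) for every perfect inverted repeat."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n):
        for arm in range(min_arm, max_arm + 1):
            if start + 2 * arm > n:
                break  # no room left for a second arm of this length
            # The second arm must equal the reverse complement of the first.
            target = reverse_complement(seq[start:start + arm])
            for spacer in range(max_spacer + 1):
                right = start + arm + spacer
                if right + arm > n:
                    break
                if seq[right:right + arm] == target:
                    hits.append((start, arm, spacer))
    return hits


def is_cruciform(hit, max_spacer=3):
    """Cruciform motif: an inverted repeat whose spacer is 0-3 nt."""
    return hit[2] <= max_spacer
```

For example, `find_inverted_repeats("GGGGAAAACCTTGGTTTTCCCC")` reports a 10-nt arm at position 0 with a 2-nt spacer, which `is_cruciform` accepts. The search cost grows quickly with sequence length, so a genome-scale scan would need a more efficient approach.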

Mirror repeat: a pair of DNA sequences, each 10–100 nt in length and separated by a spacer of 0–100 nt, whose sequence composition on the same strand of DNA is such that the bases of the first repeat, when read in the 5′→3′ orientation, are identical to those of the second repeat read in the 3′→5′ orientation (palindrome); only perfectly matching repeats are included.

Triplex motif: the subset of mirror repeat sequences comprising only purines (R = A and G) or only pyrimidines (Y = C and T) on the same strand of DNA, and whose repeats are separated by a short spacer of 0–8 nt. These motifs are able to form various intramolecular three-stranded (triplex, H-DNA) isoforms stabilized by Hoogsteen hydrogen bonds (1,52,53). Only R•Y-containing mirror repeats that may yield the most stable triplet combinations are included: A:A•T and G:G•C base triplets for the R:R•Y type of intramolecular triplex, and T:A•T and C+:G•C triplets for the Y:R•Y type (colon indicates Hoogsteen hydrogen-bonded bases; dot indicates Watson–Crick hydrogen-bonded bases).
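A minimal Python sketch of the mirror repeat and triplex criteria follows. It is a brute-force illustration, not the non-B DB implementation. The names `find_mirror_repeats` and `is_triplex_motif` are hypothetical, and the sketch assumes that the purine-only or pyrimidine-only requirement applies to the repeat arms, with no constraint on the spacer bases; the definition above does not settle that point.

```python
# Illustrative sketch only: perfect mirror repeats and the triplex subset.
# Names and the (start, arm_length, spacer_length) hit format are hypothetical.

PURINES = set("AG")
PYRIMIDINES = set("CT")


def find_mirror_repeats(seq, min_arm=10, max_arm=100, max_spacer=100):
    """Return (start, arm_length, spacer_length) for every perfect mirror repeat."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n):
        for arm in range(min_arm, max_arm + 1):
            if start + 2 * arm > n:
                break  # no room left for a second arm of this length
            # The second arm must be the first arm read backwards (no complement).
            target = seq[start:start + arm][::-1]
            for spacer in range(max_spacer + 1):
                right = start + arm + spacer
                if right + arm > n:
                    break
                if seq[right:right + arm] == target:
                    hits.append((start, arm, spacer))
    return hits


def is_triplex_motif(seq, hit, max_spacer=8):
    """Triplex motif: spacer of 0-8 nt and arms made only of R or only of Y.

    Assumption: only the arms are checked; spacer bases are unconstrained.
    """
    start, arm, spacer = hit
    if spacer > max_spacer:
        return False
    # The arms are mirror images, so checking one arm covers both.
    bases = set(seq[start:start + arm].upper())
    return bases <= PURINES or bases <= PYRIMIDINES
```

With `seq = "AAGGAGAAGGTCAGGAAGAGGAA"`, the hit `(0, 10, 3)` is a purine-only mirror repeat with a 3-nt spacer, so `is_triplex_motif(seq, (0, 10, 3))` is true, whereas a mirror repeat whose arms mix purines and pyrimidines is rejected.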

Direct repeat: two tracts of DNA, each comprising 10–50 nt and separated by 0–5 nt, having the same sequence composition.

Slipped motif: the subset of direct repeat sequences without a spacer (tandem repeats); when aligned in an out-of-register fashion, tandem repeats may give rise to single-stranded loops and/or hairpins (1).
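The direct repeat and slipped motif criteria map naturally onto a regular expression with a backreference. The sketch below is one possible illustration in Python, not the search used by non-B DB; `DIRECT_REPEAT`, `find_direct_repeats` and `is_slipped` are hypothetical names. Because of the way the pattern is written, it reports at most one hit per start position (the longest repeat unit, with the shortest spacer that fits).

```python
# Illustrative sketch only: direct repeats via a regex backreference.
# Names and the (start, unit_length, spacer_length) hit format are hypothetical.
import re

# Group 1: repeat unit of 10-50 nt; group 2: spacer of 0-5 nt (shortest first);
# \1 requires an exact second copy of the unit. The lookahead lets matches
# that start at neighbouring positions overlap.
DIRECT_REPEAT = re.compile(r"(?=([ACGT]{10,50})([ACGT]{0,5}?)\1)")


def find_direct_repeats(seq):
    """Return (start, unit_length, spacer_length), one hit per start position."""
    return [
        (m.start(), len(m.group(1)), len(m.group(2)))
        for m in DIRECT_REPEAT.finditer(seq.upper())
    ]


def is_slipped(hit):
    """Slipped motif: a direct repeat with no spacer (tandem repeat)."""
    return hit[2] == 0
```

For instance, `find_direct_repeats("ACGTTGCAAG" * 2)` returns `[(0, 10, 0)]`, a tandem repeat that `is_slipped` accepts, while inserting `TT` between the two copies gives `[(0, 10, 2)]`, a direct repeat that is not a slipped motif. Backreference matching can be slow on long sequences, so this form is suited to illustration and small inputs.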

Z-DNA motif: five or more tandem repeats, each comprising an alternating pyrimidine–purine dinucleotide motif, in which the pattern YG is maintained on at least one of the DNA strands; examples include (CG•CG)6, (CA•TG)5 and [(TG)3(CG)4•(CG)4(CA)3]; these motifs may adopt the left-handed Z-DNA conformation (3,54).
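One way to express the Z-DNA criterion is as a regular expression over a single strand. In the Python sketch below, the first alternative covers runs of CG/TG units (YG on the scanned strand) and the second covers runs of CG/CA units (YG on the complementary strand), following the table entry. The names `Z_DNA` and `find_z_dna` are hypothetical, and the pattern is an interpretation of the stated criteria, not the non-B DB code.

```python
# Illustrative sketch only: Z-DNA motifs as >=5 tandem YG-type dinucleotides.
# Names are hypothetical; the pattern is an interpretation of the criteria.
import re

# (CG|TG){5,} keeps YG on the scanned strand;
# (CG|CA){5,} keeps YG on the complementary strand.
Z_DNA = re.compile(r"(?:[CT]G){5,}|(?:C[AG]){5,}")


def find_z_dna(seq):
    """Return (start, end, matched_sequence) for each non-overlapping Z-DNA motif."""
    return [(m.start(), m.end(), m.group()) for m in Z_DNA.finditer(seq.upper())]
```

The three examples given in the definition all match: `"CG" * 6`, `"CA" * 5` and `"TG" * 3 + "CG" * 4` are each reported as a single motif spanning the whole string, whereas `"CG" * 4` falls below the five-unit threshold and yields no hit.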

G-quadruplex-forming repeat: four blocks, each containing the same number (n) of G bases (n can vary from 3 to 7), on the plus or minus strand, separated by 1–7 nt; this type of DNA sequence may adopt quadruplex structures (2); overlapping tracts of four G-blocks are also considered.
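The G-quadruplex criterion can be sketched in Python as one regular expression per block size n, scanned for G-blocks (plus strand) and C-blocks (minus strand read on the plus strand). The function name `find_g_quadruplexes` and the hit format are hypothetical. The sketch assumes that the 1–7 nt separating blocks may contain any base, including G; the definition above does not say whether that is permitted, so a long G-tract may be reported for more than one value of n.

```python
# Illustrative sketch only: four equal-sized G-blocks separated by 1-7 nt.
# Names and the (start, end, block_size, strand) hit format are hypothetical.
import re


def find_g_quadruplexes(seq, min_g=3, max_g=7, min_loop=1, max_loop=7):
    """Return (start, end, block_size, strand) hits, overlapping ones included."""
    seq = seq.upper()
    hits = []
    # C-blocks on the scanned strand correspond to G-blocks on the minus strand.
    for base, strand in (("G", "+"), ("C", "-")):
        for n in range(min_g, max_g + 1):
            block = f"{base}{{{n}}}"
            # Assumption: loop bases are unconstrained; shortest loop tried first.
            loop = f"[ACGT]{{{min_loop},{max_loop}}}?"
            # The lookahead allows hits that start inside an earlier hit.
            pattern = re.compile(f"(?=({block}(?:{loop}{block}){{3}}))")
            for m in pattern.finditer(seq):
                hits.append((m.start(), m.start() + len(m.group(1)), n, strand))
    return hits
```

As a usage example, `find_g_quadruplexes("GGGTTAGGGTTAGGGTTAGGG")` returns `[(0, 21, 3, "+")]`, and the reverse complement of that sequence gives the same span on the `"-"` strand.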

A-phased repeat: three or more runs of A bases (A-tracts) in phase with the helical pitch of the DNA double helix, i.e. 10 bp. An A-tract is defined as a set of A•T base pairs without a TpA step (47,55–57). Three or more tracts of A3–7, T3–7, AAATTT, AAATTTT and AAAATTT (in any combination) on the plus or minus strand, whose centers are separated by 10 bases, are considered. Since A-tracts induce static bends in the DNA double helix, the overall DNA superhelix is expected to display either a left-handed or a right-handed writhe (47,55–57).

As mentioned, none of the search criteria used here allows for interruptions in the repeats, and no thermodynamic information was factored into the algorithms used.
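The A-phased repeat criterion is the least mechanical of the set, so the following Python sketch is deliberately simplified and rests on several assumptions: tracts are tokenized greedily by a regular expression, only consecutive tracts are chained, and the center-to-center distance must be exactly 10 bases. The names `A_TRACT` and `find_a_phased_repeats` are hypothetical, and this is not the non-B DB algorithm.

```python
# Illustrative sketch only: chains of >=3 A-tracts with centers 10 bases apart.
# Names and the (start, end, tract_count) hit format are hypothetical.
import re

# Allowed tracts: AAAATTT, AAATTTT, AAATTT, A3-7 and T3-7 (no TpA step).
# Assumption: a run longer than 7 is truncated to its first 7 bases.
A_TRACT = re.compile(r"AAAATTT|AAATTTT|AAATTT|A{3,7}|T{3,7}")


def find_a_phased_repeats(seq, phase=10, min_tracts=3):
    """Return (start, end, tract_count) for each chain of in-phase A-tracts."""
    tracts = [(m.start(), m.end()) for m in A_TRACT.finditer(seq.upper())]
    hits = []
    chain = []
    for tract in tracts:
        # A tract center is (start + end - 1) / 2, so comparing start + end
        # against 2 * phase keeps the arithmetic in integers.
        if chain and sum(tract) - sum(chain[-1]) == 2 * phase:
            chain.append(tract)
            continue
        if len(chain) >= min_tracts:
            hits.append((chain[0][0], chain[-1][1], len(chain)))
        chain = [tract]  # assumption: an out-of-phase tract restarts the chain
    if len(chain) >= min_tracts:
        hits.append((chain[0][0], chain[-1][1], len(chain)))
    return hits
```

For example, `find_a_phased_repeats(("AAAAA" + "CGCGC") * 3)` returns `[(0, 25, 3)]`, since the three A5 tracts have centers 10 bases apart, while lengthening each spacer by one base breaks the phasing and yields no hit. T-tracts on the scanned strand stand in for A-tracts on the minus strand.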