2007 Mar 21;81(11):5617–5627. doi: 10.1128/JVI.01405-06

TABLE 2.

Consolidated data obtained for the consensus motifs from the databases of sequences flanking the integration sites and randomly picked human sequences of the same length^a

Motif  Sequence                                            Width (bases)  No. of sites  Avg no. of occurrences
1      GGCGCGCGCCTGTAATCCCAGCACCTCGGGAGGCCGAGGCGGGGGGATCA  50             500           1.17
2      CCCCGGGTGGCGGGGATTGCAGGGATCTGCGATCACGCCAAGC         43             500           1.17
3      CCAGCCTGGGCAACAGAGTGAGACCCCGTCT                     31             461           1.07
4      TGCCTCAGCCTCCCAAATAGCTGGGATTACAGGCGTGAGCCACCACGCCC  50             450           0.99
5      AGACCAGCCTGGGCAACATAGTGAAACCCCGTCTCTACAAAAAAAAAAAA  50             450           0.99
6      GCAGTGGCGCGATCTCGGCTCACTGCAACCTCCGCCTCCCGGGTTCAAGC  50             348           0.77
^a HIV-1 integration sequences were downloaded from the public nucleotide database as reported by Schröder et al. (47) and analyzed with MEME (3). Consensus motifs were calculated from the following parameters: the type of sequence, the number of sequences in the data set, the weight assigned to each sequence (=1), the minimum width of a consensus (5 bp), the maximum width of a consensus (50 bp), the number of times a consensus is expected to occur in a single sequence (zero, one, or more than one time per sequence), and the number of sequences found in the total data set. A position-specific probability matrix was then plotted, and the consensus sequence was determined from it; these consensus sequences are presented in bold in rows 1 through 3. A similar analysis was carried out for sequences picked randomly from the human genome, each 2,000 bp long; the results are given in rows 4 through 6. The motifs obtained from the sequences flanking integration sites are significantly different from those obtained from the randomly picked human sequences. Note the average number of occurrences of each motif per sequence, which was obtained by dividing the total number of occurrences by the total number of sequences used for analysis: 429 for rows 1 through 3 or 452 for rows 4 through 6.
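The average given in the last column can be reproduced from the footnote's description. The following is a minimal sketch, not the authors' code: it assumes the "No. of sites" column holds the total number of occurrences of each motif, and the names `SITE_COUNTS`, `N_INTEGRATION`, `N_RANDOM`, and `average_occurrences` are hypothetical, introduced only for illustration.

```python
# Total occurrences per motif, copied from the "No. of sites" column of Table 2.
# Assumption: this column is the "total number of occurrences" the footnote refers to.
SITE_COUNTS = {1: 500, 2: 500, 3: 461, 4: 450, 5: 450, 6: 348}

# Number of sequences analyzed in each data set, as stated in the footnote.
N_INTEGRATION = 429  # sequences flanking integration sites (rows 1 through 3)
N_RANDOM = 452       # randomly picked human sequences (rows 4 through 6)


def average_occurrences(motif: int) -> float:
    """Return the average number of occurrences per sequence for one table row."""
    n_sequences = N_INTEGRATION if motif <= 3 else N_RANDOM
    return SITE_COUNTS[motif] / n_sequences


if __name__ == "__main__":
    for motif in sorted(SITE_COUNTS):
        print(f"motif {motif}: {average_occurrences(motif):.4f}")
```

Under this assumption, rows 1, 2, 3, and 6 round to the tabulated values. Rows 4 and 5 give 450/452, about 0.9956, which is close to the tabulated 0.99 but would round to 1.00; the small difference may reflect how the published values were rounded.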