Table 1.
Values in the last three columns are averages per 1 Mb surrounding the integration site (c).

Virus | Target cell | Set name | Total sequence reads (a) | Unique integration sites (b) | % G/C content/Mb | No. Transc. units/Mb | No. CpG islands/Mb |
---|---|---|---|---|---|---|---|
HIV | Activated CD4+ T cell | Activated | 1183 | 524 | 46.7*** | 9.5*** | 60.9*** |
HIV | Resting CD4+ T cell | Resting | 1955 | 947 | 44.5*** | 6.7*** | 40.0*** |
HIV | Activated CD4+ T cell | Activated+dN | 1500 | 663 | 47.3*** | 9.5** | 67.0*** |
HIV | Resting CD4+ T cell | Resting+dN | 1076 | 527 | 44.4*** | 6.9** | 38.0*** |
Transc., transcription.
(a) The number of sequences recovered by pyrosequencing that contained the proper barcode and long terminal repeat (LTR) primer.
(b) The number of total sequence reads (a) that had a single best match to the human genome of >98% identity, where the terminal viral sequence (5′-CA-3′) was within 3 bp of the high-quality match, after all duplicate integration sites were condensed into a single entry.
(c) The average values of ‘% G/C content/Mb’, ‘No. Transc. units/Mb’ and ‘No. CpG islands/Mb’ correspond to the data sets used to generate the heatmap tiles in Fig. 4 for ‘GC content, 1 Mb’, ‘Expression density, Unigene, gene density, 1 Mb’ and ‘CpG Islands, Density, 1 Mb’, respectively.
*P < 0.05.
**P < 0.01.
***P < 0.001.
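
The criteria used to count unique integration sites (a single best genomic match of >98% identity, the terminal viral 5′-CA-3′ within 3 bp of the match, and duplicate sites condensed into one entry) can be illustrated with a minimal sketch. This is not the pipeline used to produce the table. The `Alignment` record, its field names, and the choice to define a duplicate as the same chromosome, position and strand are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one read aligned to the human genome."""
    read_id: str
    chrom: str
    position: int         # inferred integration coordinate
    strand: str           # "+" or "-"
    identity: float       # percent identity of the genomic match
    ltr_offset: int       # bp between the LTR end (5'-CA-3') and the match start
    is_unique_best: bool  # True if the read has a single best match


def unique_integration_sites(alignments, min_identity=98.0, max_ltr_offset=3):
    """Apply the three stated criteria and condense duplicate sites.

    Assumption: two reads mark the same site when chromosome, position
    and strand all agree.
    """
    sites = set()
    for aln in alignments:
        if not aln.is_unique_best:
            continue  # no single best match to the genome
        if aln.identity <= min_identity:
            continue  # match must exceed 98% identity
        if abs(aln.ltr_offset) > max_ltr_offset:
            continue  # viral end must lie within 3 bp of the match
        sites.add((aln.chrom, aln.position, aln.strand))
    return sorted(sites)


# Illustrative reads only; these are not data from the study.
example_reads = [
    Alignment("r1", "chr1", 100, "+", 99.5, 0, True),
    Alignment("r2", "chr1", 100, "+", 99.0, 2, True),   # duplicate of r1's site
    Alignment("r3", "chr2", 500, "-", 98.0, 1, True),   # identity not above 98%
    Alignment("r4", "chr3", 700, "+", 99.9, 5, True),   # LTR end too far from match
    Alignment("r5", "chr4", 900, "-", 99.9, 1, False),  # multiple best matches
]
example_sites = unique_integration_sites(example_reads)
```

Under these assumptions, the count in the ‘Unique integration sites’ column would correspond to `len(unique_integration_sites(reads))` for the reads of a given set.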