Proceedings of the National Academy of Sciences of the United States of America. 2011 Oct 10;108(42):17360–17365. doi: 10.1073/pnas.1109272108

Hydroxyurea induces de novo copy number variants in human cells

Martin F Arlt a, Alev Cagla Ozdemir a, Shanda R Birkeland b, Thomas E Wilson a,b, Thomas W Glover a,1
PMCID: PMC3198378  PMID: 21987784

Abstract

Copy number variants (CNVs) are widely distributed throughout the human genome, where they contribute to genetic variation and phenotypic diversity. Spontaneous CNVs are also a major cause of genetic and developmental disorders and arise frequently in cancer cells. As with all mutation classes, genetic and environmental factors almost certainly increase the risk for new and deleterious CNVs. However, despite the importance of CNVs, there is limited understanding of these precipitating risk factors and the mechanisms responsible for a large percentage of CNVs. Here we report that low doses of hydroxyurea, an inhibitor of ribonucleotide reductase and an important drug in the treatment of sickle cell disease and other diseases, induce a high frequency of de novo CNVs in cultured human cells that resemble pathogenic and aphidicolin-induced CNVs in size and breakpoint structure. These CNVs are distributed throughout the genome, with some hotspots of de novo CNV formation. Sequencing revealed that CNV breakpoint junctions are characterized by short microhomologies, blunt ends, and short insertions. These data provide direct experimental support for models of replication-error origins of CNVs and suggest that any agent or condition that leads to replication stress has the potential to induce deleterious CNVs. In addition, they point to a need for further study of the genomic consequences of the therapeutic use of hydroxyurea.


Copy number variants (CNVs), defined as submicroscopic deletions or duplications of as few as 50 bp to more than a megabase, are distributed throughout the human genome (1–6). With many thousands of normal polymorphic CNVs now described in healthy individuals (7–9), it is clear that human genetic variation is influenced by large-scale structural changes much more than previously believed. It has also become clear that many CNVs have deleterious consequences. Spontaneous CNVs are a very important and frequent cause of genetic and developmental disorders, including intellectual disability, neuropsychiatric disorders, and structural birth defects (10–16), and their frequency suggests a high de novo mutation rate.

Despite their importance, there is limited understanding of how many CNVs arise and the risk factors involved. Recurrent CNVs are observed at regions flanked by large segmental duplications and arise by nonallelic homologous recombination during meiosis (4). However, nonrecurrent CNVs are observed throughout the genome in regions lacking such segmental duplications, and the mechanisms and risk factors involved in their formation are poorly understood. We previously demonstrated that low doses of the DNA polymerase inhibitor aphidicolin (APH) induced de novo CNVs at a high frequency (17, 18). These CNVs resembled both normal polymorphic and de novo, nonrecurrent CNVs seen in humans in size and structure. However, these studies did not elucidate whether CNV induction could be attributed to generalized replication stress or was specific to APH-mediated polymerase inhibition. To answer this question, we performed a series of experiments using the mechanistically distinct and clinically relevant replication inhibitor hydroxyurea (HU). HU leads to replication stress via inhibition of ribonucleotide reductase and perturbation of nucleotide pools and replication fork progression (19). Chronic HU treatment also leads to increased expression of fetal hemoglobin, possibly through stimulation of cellular nitric oxide and cGMP signaling in erythroid progenitors (19–23), which results in reduced sickling and amelioration of disease (24). As a result, HU-treated patients have fewer vasoocclusive events and require fewer transfusions and hospitalizations (24). HU is therefore an effective drug for long-term management of sickle cell disease (21) and other disorders, including certain cancers, myeloproliferative disorders, thalassemias, and HIV infection (25). Here we examined the effects of therapeutic doses of HU on CNV formation in normal human fibroblasts.

Results

Genomic Effects of HU in Human Fibroblasts.

Normal human telomerase reverse transcriptase (hTERT)-immortalized human fibroblasts were cultured in the presence of HU, at doses equivalent to serum levels achieved in sickle cell patients (26, 27), or APH for 72 h before plating for clones. Expanded individual clones were subjected to CNV analysis using Illumina 1M SNP arrays (Fig. S1). De novo CNVs are defined as a gain or loss detected in a clone that is absent from the parental cell population. De novo CNVs can arise as a result of drug exposure or can arise spontaneously during cell culture. The parental cell population had been expanded from a single clone to reduce the number of potentially mosaic CNVs within it. Fifteen independent clones from untreated, 100 μM, 200 μM, and 300 μM HU-treated cells and 14 positive control clones of 0.4 μM APH-treated cells were analyzed. De novo CNVs were identified in untreated clones at a frequency of 0.80 CNVs per clone and in APH-treated clones at a frequency of 2.79 CNVs per clone (P < 0.00001) (Fig. 1A). HU treatment induced a significant number of de novo CNVs, ranging from 1.60 CNVs per clone in 100 μM and 300 μM HU-treated cells to 2.87 CNVs per clone in 200 μM HU-treated cells (P = 0.023 and P < 0.000001, respectively).

Fig. 1.

HU induces de novo CNVs in normal human fibroblasts. (A) Incidence of de novo CNVs in normal, hTERT-immortalized human fibroblasts treated with 100–300 μM HU or 0.4 μM APH for 72 h. Fifteen independent clones each of untreated, 100 μM, 200 μM, and 300 μM HU-treated cells and 14 clones of APH-treated cells were analyzed. Error bars indicate SE. (B) Incidence of de novo CNVs in cells cultured in the presence or absence of HU or APH for varying lengths of time corresponding to two population doublings under each treatment (NT, 48 h; APH, 67 h; HU, 120 h). De novo CNVs from 10 independent clones each of untreated, 100 μM HU-treated, and 0.4 μM APH-treated cells and nine clones each of 200 μM and 300 μM HU-treated cells were analyzed. Error bars indicate SE. (C) Colony-forming ability of HU-treated cells was reduced compared with untreated cells. Error bars indicate SD. (D) Poisson distributions illustrating the differences in de novo CNV incidence between HU-treated (red) and untreated (blue) cells. Graph includes all doses of HU from both experiments summarized in A and B. (E) Size distribution of CNVs. Graph showing the fraction of CNVs by size for the two treatment groups, 0.4 μM APH (blue circles) and 100–300 μM HU (red squares), and for untreated cells (green triangles).

During these experiments, we noted a large effect of HU on cell growth rate. Direct measurement revealed that the population doubling time of HU-treated cells was approximately fourfold slower than in untreated cells. Thus, cells undergoing the fixed 72-h treatment went through varying numbers of cell divisions. During this period, control cells underwent 2.5 cell doublings, whereas HU-treated cells failed to complete a single doubling.

To allow a more direct comparison of CNV frequency per cell division, we repeated the experiment such that all treatment groups were cultured for the time needed to undergo two cell doublings. We analyzed 10 independent clones each from untreated, 100 μM HU-treated, and 0.4 μM APH-treated cells, as well as nine clones each from 200 μM and 300 μM HU-treated cells. Untreated clones again showed a low background frequency of 0.50 de novo CNVs per clone (Fig. 1B). After two population doublings with HU treatment, the average number of CNVs increased three- to fourfold over controls, ranging from 1.56 CNVs per clone in 200 μM to 2.10 CNVs per clone in 100 μM and 300 μM HU-treated cells (P = 0.011 and P < 0.001, respectively). Under these conditions, the colony-forming ability of HU-treated cells was significantly reduced compared with untreated cells (P < 10^−8) (Fig. 1C). Because the reduction in cell viability may be due, in part, to negative selection against cells with chromosome aberrations, the observed de novo CNV frequency may be an underestimate.

Comparison of all HU-treated and untreated clones from both experiments yielded a significance of P < 0.000001 and clearly demonstrated that HU shifted the mean rate of CNV occurrence according to the Poisson distribution (Fig. 1D). Thus, HU is a potent inducer of CNVs. CNVs consisted of a mix of both deletions and duplications. Nine of 17 CNVs (53%) from untreated cells were of the deletion type. The proportion of deletions was slightly higher in both APH-treated clones (58 of 71; 82%) and HU-treated clones (94 of 145; 65%).
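The shift in the Poisson mean can be illustrated with a short calculation. The sketch below is not the authors' analysis: it assumes that pooling is appropriate and uses the totals quoted in the text (17 CNVs in 25 untreated clones and 145 CNVs in 73 HU-treated clones, all doses and both experiments combined). The function name `poisson_pmf` is chosen here for illustration.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing exactly k events for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# Pooled mean CNVs per clone, taken from the counts quoted in the text
# (pooling across doses and experiments is an assumption of this sketch).
untreated_rate = 17 / 25   # about 0.68 CNVs per clone
hu_rate = 145 / 73         # about 1.99 CNVs per clone

for k in range(7):
    print(
        f"{k} CNVs per clone: "
        f"untreated {poisson_pmf(k, untreated_rate):.3f}, "
        f"HU {poisson_pmf(k, hu_rate):.3f}"
    )
```

Under these assumed means, most untreated clones are expected to carry zero or one de novo CNV, whereas the HU distribution is shifted toward two or more.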

Although HU and APH perturb DNA replication via different mechanisms, both agents induce CNVs with a similar size distribution (Fig. 1E). HU-induced deletions and duplications were generally large, ranging in size from 1 kb to 35.7 Mb, with a median size of 132 kb (median deletion size, 126 kb; median duplication size, 166 kb). Consistent with our previous report (18), APH-induced deletions and duplications were similar, ranging from 1 kb to 80.3 Mb, with a median size of 165 kb (median deletion size, 163 kb; median duplication size, 191 kb). These sizes are similar to those seen in spontaneous CNVs arising in untreated control cells, which ranged from 19 kb to 1.5 Mb, with a median size of 187 kb (median deletion size, 196 kb; median duplication size, 166 kb).

Approximately 82% (122 of 149) of nonoverlapping regions containing de novo CNVs spanned or interrupted genes (Dataset S1). We used a simulation to estimate the probability that observed CNV regions were enriched for genes (Materials and Methods). The simulation yielded a normal distribution, with 70% ± 4% of random regions crossing a gene (Fig. S2). The observed frequency of 82% was just outside three SDs of the simulation mean (P = 0.001 from the normal distribution). Although this result suggested a slight enrichment for CNVs at genes, it is likely that the simulation underestimated the true random probability. This would occur if CNVs observed by microarray analysis were restricted to a smaller portion of the genome than that sampled in the simulation and if the regions not well sampled by microarray were gene poor, both of which are expected for repetitive genome regions. Accordingly, we do not consider the observed CNVs to show a strong selection for or against overlapping genes.
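The arithmetic behind the "three SDs" statement can be checked directly. This sketch uses only the rounded values quoted above (observed 82%, simulation mean 70%, SD 4%) and a one-sided upper tail of the normal distribution; the variable names are illustrative.

```python
import math

observed_fraction = 0.82   # observed CNV regions crossing a gene
simulated_mean = 0.70      # mean of the random-placement simulation
simulated_sd = 0.04        # SD of the simulation

# Distance of the observation from the simulation mean, in SD units.
z = (observed_fraction - simulated_mean) / simulated_sd

# One-sided upper-tail probability of a standard normal deviate.
p_upper = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}, one-sided P = {p_upper:.4f}")
```

With these rounded inputs the observation sits at about three SDs, and the tail probability is on the order of 0.001, in line with the value reported in the text.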

Hotspots of De Novo CNV Formation.

De novo CNVs detected in HU- and APH-treated cells and untreated controls were distributed throughout the genome, with most arising in distinct, nonoverlapping regions (Fig. 2). However, superimposed on this pattern were hotspots where different overlapping CNVs were induced by both agents, implicating a common mechanism in their formation (Figs. 2 and 3, Table S1, and Dataset S1). The most apparent of these hotspots was in a 1.2-Mb window at 3q13.31, adjacent to the LSAMP gene, where we detected 20 CNVs in 73 HU-treated clones, 11 CNVs in 24 APH-treated clones, and 2 CNVs in 25 untreated clones (Fig. 3). Clustering of CNVs induced by both agents was also found at 7q11.22 in the AUTS2 gene (Fig. 3). Notably, similar deletions at the 3q13.31 site are frequent in primary osteosarcomas (28–30) and cancer cell lines (31), whereas a small number of de novo CNVs in the AUTS2 gene have been reported in individuals with neurological disorders, including autism (32–34). It is possible that there is a selective advantage in culture for cells with CNVs in these regions. Nevertheless, each CNV within these hotspots had unique boundaries, indicating that each one arose independently, as opposed to a single CNV being selected in culture, and supporting the hypothesis that these regions are especially sensitive to replication stress. These data illustrate an overlap between experimentally induced and naturally occurring de novo CNVs and suggest a common mechanistic association with replication stress in their formation.

Fig. 2.

Locations of replication stress-induced CNVs. Red circles indicate HU-induced CNVs, blue squares indicate APH-induced CNVs, and green triangles indicate spontaneously arising CNVs in untreated cells. Markers above and below chromosomes represent duplications and deletions, respectively. Three large terminal deletions were also observed and are shown, although such events were considered to be a different class of alteration than smaller CNVs. Ideograms were adapted from the University of California, Santa Cruz genome browser (http://genome.ucsc.edu) (58). Precise coordinates for all de novo CNVs are listed in Dataset S1.

Fig. 3.

Clustering of HU- and APH-induced CNVs at 3q13.31 and 7q11.22. Clustering was defined as a region of the genome containing four or more overlapping or closely adjacent CNVs within a 2.5-Mb window (Materials and Methods). Although overlapping CNVs were found in these regions, all had distinct breakpoints.

Other clusters of four or more CNVs occurred at 1p31.3 (NEGR1), 1q44 (KIF26B, SMYD3), 7q21.11 (MAGI2, RPL13AP17), 10q11.23 (PRKG1), and 16q23.3 (WWOX). CNVs at 16q23.3 are within the highly expressed and molecularly defined common fragile site, FRA16D (35). The deletions at 7q21.11 are ≈1 Mb proximal to the defined location of the fragile site FRA7E. Fragile sites have also been reported at 3q13.3, 7q11.2, and 1q44 (36), but their exact genomic locations have not been determined.

Characterization of De Novo CNV Breakpoint Junctions.

To explore the mechanism of HU-induced CNV formation, we examined the sequences at 17 CNV breakpoint junctions from 13 deletions and 2 duplications in HU-treated cells and 5 breakpoint junctions from 4 deletions and 1 duplication in untreated cells (Fig. 4, Table 1, and Fig. S3). All five sequenced CNV junctions from untreated clones were characterized by short (2–5 bp) stretches of microhomology. Of the sequenced junctions from HU-treated clones, 8 of 17 (47%) were blunt ends (0 to 1 bp of homology) and 9 of 17 (53%) had short (2–8 bp) stretches of microhomology. Three of 17 HU-induced CNV junctions were from a single, complex CNV (Fig. 4). This CNV consisted of an inverted region flanked by deletions and a short, ectopically duplicated segment detected as an insertion at the deletion breakpoint junction. This insertion matches a sequence 9 bp upstream of the junction, aligned with the reverse complement strand, suggesting that this insertion is not the result of polymerase slippage. Instead, it seems that a template switch occurred in which the newly synthesized strand folded back and used itself as a template for replication, before switching further downstream to create a 48.3-kb deletion (Fig. 4). The breakpoint junctions on either end of this insertion were characterized by a single base pair of homology.

Fig. 4.

Example of a complex de novo CNV induced by HU treatment. (A) Breakpoint junction sequencing revealed that this CNV at 3q26.31 in clone H2C21, which was called as a deletion on the basis of array data, was in fact a complex rearrangement involving an inversion of 5013 bp (red arrow), flanked by 48,309-bp and 14-bp deletions (blue bars). One of the two resulting breakpoint junctions was characterized by an insertion of 7 bp. This insertion was identified as an inverted duplication of 7 bp inserted 9 bp downstream of its location in the reference genome (gray arrows). (B) Diagram of the putative replication path taken to generate this complex CNV. (C) CNV breakpoint junction sequences from this complex CNV. The strand of DNA is indicated as (+) or (-). The inverted duplication is highlighted in gray, whereas the original position of the duplicated sequence in the reference genome is underlined. Regions of homology at the junction are underlined and highlighted in yellow. The two breakpoint junctions from this duplication had only a single base of homology. The other junction had 2 bp of microhomology at the junction. Mate-pair sequence analysis of the parental hTRT-090 (43) revealed several constitutional CNVs with a similar structure, suggesting that the replication stress-induced CNVs observed in vitro closely model the processes that occur in vivo.

Table 1.

Characterization of CNV breakpoint junctions

Treatment   Perfect homology at junction   Frequency   Fraction of total junctions
NT          0–1 bp                         0/5         0.0
NT          2–8 bp                         5/5         1.0
NT          >8 bp                          0/5         0.0
NT          Insertion (<8 bp)              0/5         0.0
HU          0–1 bp                         8/17        0.47
HU          2–8 bp                         9/17        0.53
HU          >8 bp                          0/17        0.00
HU          Insertion (<8 bp)*             1/17        0.06

*This insertion is an ectopic duplication of 7 bp within a complex CNV. The two breakpoint junctions of this insertion are included in the above values as well.

We were successful in sequencing the breakpoint junctions from 54.1% (20 of 37) of attempted CNVs. The success rate for deletions was 58.6% (17 of 29) and for duplications was 37.5% (3 of 8). It is possible that the breakpoints that were sequenced represent junctions with a simple structure that are easier to amplify. The CNVs for which breakpoint cloning failed may represent junctions with complex structures that are difficult to amplify, even using multiple primer sets flanking the breakpoints, as we have done. As such, we expect that our single complex CNV is an underrepresentation of the actual incidence of complex events in our samples.

The similarity in breakpoint junctions between spontaneous CNVs in untreated cells and HU-induced CNVs suggests that both of these types of de novo variants arise via a common mechanism and that exogenous replication stress is increasing the incidence of events that occur spontaneously at a baseline level. None of the junctions had long stretches of homology that would implicate homologous repair in their formation, thus indicating that these lesions are formed via a nonhomologous repair mechanism.

Discussion

These experiments show that HU, at doses equivalent to the peak serum levels achieved in sickle cell patients, is a potent inducer of CNVs in normal human cultured fibroblasts. The sizes and breakpoint junction sequences of HU-induced CNVs are consistent with the nonrecurrent class of de novo pathogenic CNVs (37–41), a large class of normal human CNVs (5–7, 42), and APH-induced CNVs (17, 18, 43) (Fig. S4). These results strongly support a common mechanism mediated by replication stress for the formation of CNVs found in vivo and those induced in our experimental system.

Like many human CNVs and those induced by APH, HU-induced CNVs and de novo CNVs in untreated cells were characterized by short microhomologies at breakpoint junctions and include complex rearrangement events. The fact that all of these CNVs have similar breakpoint junctions suggests that CNVs in vivo and in vitro all arise via a replication stress-mediated process. Both end-joining and replication-dependent mechanisms have been proposed to explain the microhomologies seen at CNVs. Lee et al. (37) proposed a model, termed “fork stalling and template switching,” in which replication forks switch to another active fork. Hastings et al. (44) proposed a modification of this model, termed “microhomology-mediated break-induced replication,” that invokes template switching repair of single-sided DSBs formed at collapsed forks. An important feature of human CNVs that led to these models was the existence of complex CNVs with multiple junctions indicative of multiple template switching events. One of our sequenced, HU-induced CNVs proved to be similarly complex (Fig. 4), providing direct experimental support for these models.

Because these experiments were performed using an array-based technology, there is an inherent ascertainment bias against CNVs within highly repetitive regions. However, CNVs that extend from repetitive regions into nonrepetitive sequence are still readily detectable, and the presence of repeat elements within breakpoint regions should not interfere with our ability to amplify and sequence across breakpoint junctions. Indeed, there were several instances in which one or both junctions were located within LINE1, SINE, or other DNA elements. However, in none of these cases did both breakpoints of a CNV share a common type of repeat element, arguing against a nonallelic homologous recombination-based mechanism of formation.

Because meiotic cells do not replicate chromosomes, and nonhomologous repair is down-regulated during meiosis (45), constitutional human CNVs similar to those detected here are predicted to originate in mitotic cells in or leading to the germline (18). The mitotic cell origin hypothesis has important implications for the risk involved in CNV formation. For example, males complete ongoing mitotic divisions leading to mature germ cells throughout adulthood, whereas females do so during fetal development, thus predicting a male sex bias in risk for de novo, nonrecurrent CNVs, coupled with a possible age effect. In addition, agents like HU that perturb replication may be a factor in producing CNVs in the maternal grandchildren of females exposed during pregnancy. This hypothesis also predicts that CNVs will arise frequently in postzygotic cells, leading to somatic mosaicism within or between tissues. Indeed, substantial evidence exists for somatic mosaicism of pathogenic CNVs, such as in the NF1 and DMD genes (39, 46, 47), and for apparently benign CNVs in identical twins (48) and in different tissues within individuals (49, 50).

The overlap of some CNV hotspots with common fragile sites suggests a mechanistic link between the events leading to fragile site chromosome breaks and induction of some CNVs. However, de novo CNVs were not observed in or near other frequently observed common fragile sites, including FRA3B and FRA7G. This result could be due to selection pressures inherent to generating fibroblast clones harboring CNVs in these regions or, alternatively, a more complex relationship between fragile site instability and CNVs secondary to their common induction by replication stress. Both depletion of nucleotide pools by HU and polymerase inhibition by APH lead to reduced replication fork rate, fork stalling, and activation of dormant origins (51–54). It has recently been shown that dormant origins are only activated in early replicating regions, whereas origin firing is suppressed by S-phase checkpoints in late-replicating regions of the genome (54). Late replication, failed activation of dormant origins, or a paucity of cell type-specific replication initiation sites, as recently described for the FRA3B locus (55), may be a common feature that leads to fork stalling, incomplete replication, and genomic instability of at least some fragile sites and the observed CNV hotspots.

HU is now the second agent shown to experimentally induce CNVs. Although HU and APH impair DNA replication via different mechanisms, both agents induce CNVs at a similar frequency and size distribution. Therefore, we can begin to define a class of potential risk factors for CNVs. Our results indicate that any agent or condition that leads to replication stress has the potential to induce deleterious, de novo CNVs. In contrast to the situation for direct DNA-damaging agents, comprehensive genomic studies defining the effects of agents that cause replication stress are lacking. Our results demonstrate the importance of identifying and studying the effects of such agents on our genomes to better understand the risks of replication stress for de novo CNVs.

HU is approved by the US Food and Drug Administration for the treatment of adults with sickle cell disease and is in phase III clinical trials in infants and children. It has clear benefits for the treatment of sickle cell disease, as well as some types of cancer, myeloproliferative disorders, thalassemias, and HIV infection (25). The long-term effects of HU treatment were recently reported in a 17.5-y follow-up of adults undergoing HU therapy, with no difference in the incidence of stroke, organ dysfunction, infection, or cancer reported between treated individuals and controls (56). Although HU is well-tolerated and has low toxicity in patients, reproductive studies are limited, and the long-term effects of HU on the genomes of subsequent generations have not been evaluated. Thus, our observation that HU induces CNVs at concentrations equivalent to the peak serum levels achieved in sickle cell patients (26, 27) suggests a cautionary note regarding the therapeutic use of HU. Moreover, we note that HU induced de novo CNVs in cultured cells in one or two cell divisions, whereas patients are treated with HU for many years. It is not possible to directly extrapolate current findings to human CNV risk. However, these observations indicate that the intergenerational, germline effects of HU and other replication inhibitors should be determined to directly test the replication stress hypothesis for CNV formation in vivo and further assess the potential risk for submicroscopic genomic structural changes in the genomes of HU-treated patients and their future generations.

Materials and Methods

Generation of Normal Human Fibroblast Clones Containing CNVs.

All experiments were performed with an hTERT-immortalized derivative of normal human fibroblast cell line HGMDFN090 (090), which was obtained from the Progeria Research Foundation (Peabody, MA) and has been described previously (18). Genomic DNA was prepared from cell lines using the Blood & Cell Culture DNA Mini Kit (Qiagen). Cells were grown in DMEM supplemented with 15% FBS. To create replication stress-induced CNVs, cells were treated with HU or APH. In experiment 1, cells were treated with drugs for 72 h, followed by a 24-h recovery period before plating at low density for single-cell clones. In experiment 2, cells were treated for two population doublings (NT, 48 h; APH, 67 h; HU, 120 h), as determined from cell counts of parallel cultures, and plated at low density immediately after treatment. Cells were plated at a density of 100–500 cells per 100-mm culture dish, and individual clones were isolated using cloning rings after 7–10 d and expanded. In all cases, four separate culture flasks were treated in each of two experiments to ensure that any recurrent CNVs did not arise from the same original cell.

SNP Microarrays.

CNVs were detected using the 1M feature Illumina HumanOmni1-Quad BeadChip. Arrays were run by the University of Michigan DNA Sequencing Core, including determination of probe log R ratios and B allele frequencies. CNV detection was performed using our software platform, VAMP, as previously described (43). All clones were analyzed using the appropriate mixed parental cell population as the normalization reference. This approach routinely detects CNVs larger than 20 kb and can detect CNVs as small as ≈1 kb, depending on probe placement.

CNV Breakpoint Junctions.

CNV breakpoint junctions were amplified using the Expand Long Template PCR System (Roche Applied Science). For deletions, multiple PCR primer pairs were generated that flanked deletion breakpoints. For duplications, primers were designed within the duplicated region, directed outward, as described previously (18). PCR amplification generated a product that spanned the breakpoint junction. All products were then subjected to standard Sanger sequencing. The resulting sequence was compared with the reference genome (build hg18) to identify the breakpoint junctions.
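Once a deletion junction has been aligned to the reference, the amount of perfect homology at the junction can be scored by asking how far the breakpoint can slide without changing the junction sequence. The sketch below is one common way to do this and is not taken from the authors' pipeline; the function names, the toy sequences, and the 0-based half-open coordinates are assumptions. The bins follow the categories used in Table 1.

```python
def microhomology_length(ref, del_start, del_end):
    """Bases of perfect homology at a simple deletion junction.

    ref[del_start:del_end] is the deleted segment (0-based, half-open).
    The breakpoint can slide right while the first deleted bases match the
    bases just after the deletion, and left while the bases just before the
    deletion match the last deleted bases.
    """
    del_len = del_end - del_start
    right = 0
    while (right < del_len
           and del_end + right < len(ref)
           and ref[del_start + right] == ref[del_end + right]):
        right += 1
    left = 0
    while (left < del_len - right
           and del_start - 1 - left >= 0
           and ref[del_start - 1 - left] == ref[del_end - 1 - left]):
        left += 1
    return left + right


def homology_class(n_bases):
    """Bin a homology length into the categories used in Table 1."""
    if n_bases <= 1:
        return "0-1 bp"
    if n_bases <= 8:
        return "2-8 bp"
    return ">8 bp"


# Toy reference (hypothetical): flank ACCTA, deleted GGTCATT, flank GGAAC.
toy_ref = "ACCTAGGTCATTGGAAC"
n = microhomology_length(toy_ref, 5, 12)
print(n, homology_class(n))
```

In the toy example the deleted segment and the downstream flank both begin with GG, so the junction is scored as 2 bp of microhomology. Insertions and complex junctions such as the one in Fig. 4 would need separate handling.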

Statistical Methods.

CNVs in our model system are relatively rare events, and therefore the numbers of CNVs per clone are expected to fit a Poisson distribution determined by the mean frequency of CNVs in all clones, an expectation confirmed by plotting the data as in Fig. 1D. Therefore, P values of treated vs. untreated samples were determined using the one-sided E test for comparing two Poisson mean rates (57).
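A one-sided comparison of two Poisson mean rates can be sketched as follows. This is a minimal, unoptimized version of an E-test-style calculation for the null hypothesis of equal rates, written for illustration and not the authors' code. The function names and truncation bounds are assumptions, and the CNV counts in the example are back-calculated from the per-clone frequencies reported for the 72-h experiment (for example, 1.60 × 15 = 24 and 0.80 × 15 = 12).

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability computed in log space for numerical stability."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def rate_statistic(k1, n1, k2, n2):
    """Standardized difference between two observed rates k1/n1 and k2/n2."""
    variance = k1 / n1 ** 2 + k2 / n2 ** 2
    if variance == 0:
        return 0.0
    return (k1 / n1 - k2 / n2) / math.sqrt(variance)


def e_test_one_sided(k1, n1, k2, n2):
    """P value for H1: rate1 > rate2, given k events in n clones per group."""
    pooled = (k1 + k2) / (n1 + n2)
    if pooled == 0:
        return 1.0
    t_obs = rate_statistic(k1, n1, k2, n2)
    mean1, mean2 = n1 * pooled, n2 * pooled
    # Truncate the infinite sums well beyond the bulk of each distribution.
    upper1 = int(mean1 + 12 * math.sqrt(mean1) + 20)
    upper2 = int(mean2 + 12 * math.sqrt(mean2) + 20)
    pmf1 = [poisson_pmf(x, mean1) for x in range(upper1 + 1)]
    pmf2 = [poisson_pmf(x, mean2) for x in range(upper2 + 1)]
    p_value = 0.0
    for x1, p1 in enumerate(pmf1):
        for x2, p2 in enumerate(pmf2):
            # Sum the null probability of outcomes at least as extreme.
            if rate_statistic(x1, n1, x2, n2) >= t_obs - 1e-12:
                p_value += p1 * p2
    return min(p_value, 1.0)


# Assumed counts: 24 CNVs in 15 clones (100 uM HU) vs. 12 CNVs in 15
# untreated clones, back-calculated from the reported frequencies.
print(e_test_one_sided(24, 15, 12, 15))
```

Because the null distribution is built from the pooled rate estimate, the calculation depends only on the event counts and the number of clones per group, which makes it straightforward to apply to each dose separately or to the combined data.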

To determine whether the observed clustering of CNVs within genome regions was nonrandom, we performed the Monte Carlo simulation summarized in Table S1. Mock CNVs of the same number and size as the observed set of CNVs were placed randomly throughout the genome with the following restrictions: (i) random CNVs were not allowed to fall within uncharacterized gaps in the reference sequence and on the arrays where experimental detection of CNVs was not possible, (ii) random CNVs had to be contained entirely within a single chromosome, and (iii) CNVs greater than 2.5 Mb (nine; 4% of all observed CNVs) were not included in the analysis because this small number of very large events had the potential to create overlap regions much larger than the hotspots we sought to define. CNV hotspots in either the observed or a simulation data set were identified as regions of the genome containing overlapping or closely adjacent CNVs. CNVs were considered to be closely adjacent if their nearest ends were separated by less than 250 kb (i.e., less than 10% of the maximum allowed CNV size). A simulation of 10,000 iterations was performed on the combined APH plus HU CNV sets. The number of genome regions containing different numbers of clustered CNVs was determined for the actual data and each simulation iteration. Averaging over all iterations of the simulation provided λ_N, the mean expected number of regions for each CNV count, N. When no regions with N clustered CNVs were observed in the simulation, 1/10,000 = 0.0001 was taken as the maximal estimate of λ_N. The estimated probability of observing one or more regions with N clustered CNVs, P(>0), was calculated from the cumulative Poisson distribution as 1 − Poisson(0 | λ_N). More generally, the estimated probability of observing R_obs or more regions with N clustered CNVs, P(R_obs), was calculated as 1 − Poisson(R_obs − 1 | λ_N). Using the more conservative estimate P(>0) and a cutoff of P = 0.01, this analysis indicated that observed genome regions containing four or more clustered CNVs were nonrandom (highlighted by shading in Dataset S1). Importantly, most observed CNVs (140 of 207; 68%) were not in hotspots and thus appear randomly distributed throughout the genome.
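The hotspot simulation described above can be sketched in a few functions. This is a simplified illustration, not the authors' software: it omits restriction (i) on reference gaps and array coverage, uses hypothetical chromosome lengths supplied by the caller, and all function and variable names are assumptions.

```python
import math
import random
from collections import Counter

ADJACENT_GAP = 250_000  # CNVs closer than this are treated as clustered


def place_randomly(sizes, chrom_lengths, rng):
    """Place each mock CNV entirely within one randomly chosen chromosome."""
    chroms = list(chrom_lengths)
    placed = []
    for size in sizes:
        # Weight chromosomes by the number of valid start positions.
        weights = [max(chrom_lengths[c] - size, 0) for c in chroms]
        chrom = rng.choices(chroms, weights=weights)[0]
        start = rng.randrange(chrom_lengths[chrom] - size + 1)
        placed.append((chrom, start, start + size))
    return placed


def cluster_sizes(cnvs, gap=ADJACENT_GAP):
    """Count regions by the number of overlapping or adjacent CNVs in each."""
    counts = Counter()
    current_chrom, current_end, members = None, 0, 0
    for chrom, start, end in sorted(cnvs):
        if chrom == current_chrom and start - current_end < gap:
            members += 1
            current_end = max(current_end, end)
        else:
            if members:
                counts[members] += 1
            current_chrom, current_end, members = chrom, end, 1
    if members:
        counts[members] += 1
    return counts


def expected_regions(sizes, chrom_lengths, iterations, seed=0):
    """Mean number of regions holding exactly N CNVs (lambda_N) for each N."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(iterations):
        totals.update(cluster_sizes(place_randomly(sizes, chrom_lengths, rng)))
    return {n: total / iterations for n, total in totals.items()}


def p_at_least_one(lambda_by_n, n, iterations):
    """P(>0) = 1 - Poisson(0 | lambda_N), with 1/iterations as the floor."""
    lam = lambda_by_n.get(n, 0.0) or 1.0 / iterations
    return 1.0 - math.exp(-lam)
```

A call such as `expected_regions(sizes, chrom_lengths, 10_000)` followed by `p_at_least_one(lam, 4, 10_000)` would mirror the P(>0) estimate for regions with four clustered CNVs, subject to the simplifications noted above.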

A second simulation was used to estimate the probability that observed CNV regions were enriched for genes. Here, overlapping CNVs were considered as a single genome region to prevent hotspots from having too large an influence on the outcome. As above, events greater than 2.5 Mb were not considered, yielding 149 unique CNV regions. Regions were scored as positive if they crossed any gene in the National Center for Biotechnology Information RefSeq annotation. A 10,000-iteration simulation was performed by querying the fraction of randomly placed CNV regions of the same size as the observed set that crossed genes, with the same restrictions as for hotspot identification.
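The gene-overlap simulation can be sketched in the same spirit. The example below is a toy version under stated assumptions: a single linear genome, a caller-supplied list of gene intervals standing in for the RefSeq annotation, half-open coordinates, and no exclusion of gaps. The function names are illustrative.

```python
import random
import statistics


def crosses_gene(start, end, genes):
    """True if the half-open interval [start, end) overlaps any gene."""
    return any(start < g_end and g_start < end for g_start, g_end in genes)


def simulated_gene_fraction(sizes, genome_length, genes, iterations, seed=0):
    """Mean and SD of the fraction of random regions that cross a gene."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(iterations):
        hits = 0
        for size in sizes:
            start = rng.randrange(genome_length - size + 1)
            hits += crosses_gene(start, start + size, genes)
        fractions.append(hits / len(sizes))
    return statistics.mean(fractions), statistics.stdev(fractions)
```

The returned mean and SD play the role of the 70% ± 4% reported in the Results, against which the observed fraction of gene-crossing regions is compared.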

Supplementary Material

Supporting Information

Acknowledgments

We thank Sountharia Rajendran for technical assistance, Douglas Engel and John Moran for insightful discussions and comments on the manuscript, and Susan Dagenais and Robert Lyons in the University of Michigan DNA Sequencing Core for assistance with SNP arrays. This work was supported by National Institute of Environmental Health Sciences Grant RC1-ES018672 (to T.W.G. and T.E.W.) and by a research grant from the March of Dimes Foundation (to T.W.G.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1109272108/-/DCSupplemental.
