Genome Research. 2002 Jan;12(1):3–15. doi: 10.1101/gr.214802

Generation and Comparative Analysis of ∼3.3 Mb of Mouse Genomic Sequence Orthologous to the Region of Human Chromosome 7q11.23 Implicated in Williams Syndrome

Udaya DeSilva 1,4,5, Laura Elnitski 3,4, Jacquelyn R Idol 1, Johannah L Doyle 1, Weiniu Gan 2,6, James W Thomas 1, Scott Schwartz 3, Nicole L Dietrich 2, Stephen M Beckstrom-Sternberg 1,2, Jennifer C McDowell 2, Robert W Blakesley 1,2, Gerard G Bouffard 1,2, Pamela J Thomas 2, Jeffrey W Touchman 1,2, Webb Miller 3, Eric D Green 1,2,7
PMCID: PMC155257  PMID: 11779826

Abstract

Williams syndrome is a complex developmental disorder that results from the heterozygous deletion of a ∼1.6-Mb segment of human chromosome 7q11.23. These deletions are mediated by large (∼300 kb) duplicated blocks of DNA of near-identical sequence. Previously, we showed that the orthologous region of the mouse genome is devoid of such duplicated segments. Here, we extend our studies to include the generation of ∼3.3 Mb of genomic sequence from the mouse Williams syndrome region, of which just over 1.4 Mb is finished to high accuracy. Comparative analyses of the mouse and human sequences within and immediately flanking the interval commonly deleted in Williams syndrome have facilitated the identification of nine previously unreported genes, provided detailed sequence-based information regarding 30 genes residing in the region, and revealed a number of potentially interesting conserved noncoding sequences. Finally, to facilitate comparative sequence analysis, we implemented several enhancements to the program PipMaker, including the addition of links from annotated features within a generated percent-identity plot to specific records in public databases. Taken together, the results reported here provide an important comparative sequence resource that should catalyze additional studies of Williams syndrome, including those that aim to characterize genes within the commonly deleted interval and to develop mouse models of the disorder.

[The sequence data described in this paper have been submitted to GenBank under accession nos. AF267747, AF289666, AF289667, AF289664, AF289665, AC091250, AC079938, AC084109, AC024607, AC074359, AC024608, AC083858, AC083948, AC084162, AC087420, AC083890, AC080158, AC084402, AC083889, AC083857, and AC079872.]


The past decade has brought spectacular advances in our understanding of the contiguous gene deletion disorder Williams syndrome (WS, also known as Williams-Beuren syndrome; OMIM 194050 [see http://www.ncbi.nlm.nih.gov/Omim]). This complex and intriguing developmental disorder is associated with defects in multiple physiological systems, with the classic phenotypic features including cardiovascular disease, dysmorphic facial characteristics, infantile hypercalcemia, and unique cognitive and personality components (Burn 1986; Morris et al. 1988; Bellugi et al. 1990, 1999; Lashkari et al. 1999; Mervis et al. 1999; Donnai and Karmiloff-Smith 2000; Mervis and Klein-Tasman 2000; Morris and Mervis 2000).

A key turning point in elucidating the genetic basis of WS came in 1993 with the discovery that the disorder is associated with hemizygous microdeletions within human chromosome 7q11.23 that include the elastin gene (ELN; Ewart et al. 1993). Since that time, there have been numerous studies aiming to map this region of chromosome 7, identify the genes residing within the commonly deleted interval, and associate the phenotypic features of the disorder with the haploinsufficiency of specific genes. These efforts have been aided by a collaboration between our group and the Washington University Genome Sequencing Center (http://genome.wustl.edu/gsc) to map and sequence the human WS region. However, significant challenges have been encountered. For example, attempts to establish contiguous and accurate long-range physical maps of the human WS region have been hampered by a number of problems, including unstable yeast artificial chromosome (YAC) clones derived from the region (which are most likely a consequence of the notably high density of repetitive sequences) and the presence of several large (∼300 kb), closely spaced blocks of DNA with near-identical sequence (Gorlach et al. 1997; Osborne et al. 1997a; Hockenhull et al. 1999; Korenberg et al. 2000; Peoples et al. 2000; Valero et al. 2000). The latter genomic segments, which greatly confound conventional mapping and sequencing strategies, are particularly important, both because they contain gene and pseudogene sequences (Gorlach et al. 1997; Osborne et al. 1997a; Perez Jurado et al. 1998) and because they appear to play a central role in mediating the inter- and intrachromosomal recombination events that lead to the WS-associated deletions (Perez Jurado et al. 1996; Robinson et al. 1996; Baumer et al. 1998).

Despite the challenges associated with mapping and sequencing the human WS region, numerous genes residing within the commonly deleted interval and the flanking duplicated segments have been identified (Fig. 1; Table 1; Francke 1999; Osborne 1999; Osborne and Pober 2001). The diverse phenotypic features associated with WS likely result from haploinsufficiency of these and/or yet-to-be-identified genes that reside within the deleted interval. However, with the exception of ELN and cardiovascular/connective tissue disease, correlating individual genes with specific phenotypic features has proven difficult.

Figure 1.


Long-range organization of human and mouse Williams syndrome (WS) regions. A physical map of the WS regions on human chromosome 7q and mouse chromosome 5G is depicted emphasizing the positions of the known genes residing within and flanking the interval commonly deleted in WS (DeSilva et al. 1999; Francke 1999; Hockenhull et al. 1999; Osborne 1999; Korenberg et al. 2000; Peoples et al. 2000; Valero et al. 2000). In the human WS region, this interval spans ∼1.6 Mb (indicated by a bold dashed line) and is flanked by duplicated blocks of DNA of near-identical sequence (estimated at ∼300 kb in size; indicated by dark rectangles). The relative positions of the centromere (CEN) and telomere (TEL) are indicated in each case. Note the inverted orientation of the two discontiguous segments of human chromosome 7 relative to the single contiguous segment of mouse chromosome 5G. The relative positions of the known human and mouse genes residing in this region are indicated, with additional details provided in Table 1. Depicted below the map of the mouse WS region are the 21 overlapping BAC/PAC clones selected for sequencing (see http://bio.cse.psu.edu/publications/desilva for a complete contig map of the mouse WS region), with the current sequencing status (finished, full shotgun, or working draft) indicated at the bottom (also see Table 2). Note that the depicted genomic regions and the BAC/PAC clones are not drawn to scale.

Table 1.

Known Human/Mouse Genes Residing Within or Near the WS Region

Name (human/mouse) | Other name(s) | Reference

Reside in single-copy interval commonly deleted in WS
FKBP6/Fkbp6 | | Meng et al. 1998b
FZD9/Fzd9 | FZD3 | Wang et al. 1997; Wang et al. 1999
BAZ1B/Baz1b | WSTF, WBSCR9 | Lu et al. 1998; Peoples et al. 1998
BCL7B/Bcl7b | | Jadayel et al. 1998; Meng et al. 1998a
TBL2/Tbl2 | WS-βTRP | Meng et al. 1998a; Perez Jurado et al. 1999
WBSCR14/Wbscr14 | WS-bHLH | Meng et al. 1998a; de Luis et al. 2000
STX1A/Stx1a | | Osborne et al. 1997b; Nakayama et al. 1998
CLDN3/Cldn3 | CPETR2 | Paperna et al. 1998
CLDN4/Cldn4 | CPETR1 | Paperna et al. 1998
ELN/Eln | | Fazio et al. 1991; Ewart et al. 1993; Wydner et al. 1994
LIMK1/Limk1 | | Frangiskakis et al. 1996; Tassabehji et al. 1996
EIF4H/Eif4h | WBSCR1 | Osborne et al. 1996
WBSCR15/Wbscr15 | WBSCR5 | Doyle et al. 2000; Martindale et al. 2000
RFC2/Rfc2 | | Peoples et al. 1996
CYLN2/Cyln2 | WBSCR3, WBSCR4 | Hoogenraad et al. 1998
GTF2IRD1/Gtf2ird1 | WBSCR11, MusTRD1, CREAM1, BEN | Tassabehji et al. 1999; Yan et al. 2000; Bayarsaihan and Ruddle 2000

Reside in duplicated segment in human
GTF2I/Gtf2i | TFII-1, BAP135, SPIN | Perez Jurado et al. 1998; Wang et al. 1998
NCF1/Ncf1 | p47-phox | Francke et al. 1990; Jackson et al. 1994; Gorlach et al. 1997; DeSilva et al. 2000
POM121/Pom121 | | Hallberg et al. 1993

Reside in regions flanking the WS region in human
GUSB/Gusb | | Oshima et al. 1987
ASL/Asl | | Todd et al. 1989
HIP1/Hip1 | | Wedemeyer et al. 1997
MDH2/Mdh2 | | Habets et al. 1992
POR/Por | | Shephard et al. 1992
ZP3/Zp3 | | van Duin et al. 1992
PAI/Pai | | Loskutoff et al. 1987
CUTL1/Cutl1 | | Scherer et al. 1993

As a complement to the above efforts, our interests have focused on the comparative mapping and sequencing of the WS region in the human and mouse genomes. Previously, we established a bacterial clone-based contig map of the mouse genomic region encompassing the Eln and Ncf1 (p47-phox) genes (DeSilva et al. 1999); note that NCF1 gene/pseudogene sequences reside within the duplicated blocks in the human WS region (Fig. 1; Table 1). Interestingly, we discovered that the mouse WS region is devoid of the large duplicated segments that are characteristic of its human counterpart. To acquire a more detailed view of this important genomic interval, we have now extended our mouse physical mapping efforts as well as sequenced the entire mouse WS region. Here, we report the generation of ∼3.3 Mb of mouse genomic sequence and the results of detailed computational analyses, which included extensive comparisons with the available sequence of the human WS region.

RESULTS

Physical Mapping of the Mouse WS Region

The segment of the mouse genome corresponding to the human WS region resides on distal mouse chromosome 5. Our previous clone-based physical mapping efforts resulted in the construction of a bacterial artificial chromosome (BAC)/P1-derived artificial chromosome (PAC) contig spanning a large portion of this genomic region, including the entire interval flanked by the Eln and Ncf1 genes (DeSilva et al. 1999). As part of a broader effort to generate BAC-based physical maps of the portions of the mouse genome orthologous to human chromosome 7 (Thomas et al. 2000), we extended this contig map to encompass the entire WS region (including the interval commonly deleted in WS, the segment that is duplicated in human, and additional flanking DNA). The complete contig map is available as part of an electronic supplement accompanying this paper (at http://bio.cse.psu.edu/publications/desilva). Based on our earlier (DeSilva et al. 1999) and expanded physical mapping efforts, a set of 21 clones, which together fully encompass the mouse WS region, was selected for systematic sequencing (Fig. 1).

Consistent with our previous mapping studies (DeSilva et al. 1999), we encountered no evidence for the presence of large, duplicated blocks of DNA within the mouse WS region, such as those residing in the orthologous segment on human chromosome 7q11.23. Indeed, the clone-based physical mapping of the mouse WS region proceeded smoothly, in striking contrast to our efforts and those of others (Osborne et al. 1996; Hockenhull et al. 1999; Korenberg et al. 2000; Peoples et al. 2000; Valero et al. 2000) in mapping the human WS region.

The long-range organization of the mouse and human WS regions is also different in other ways. Specifically, a single contiguous block of mouse chromosome 5 encompassing the WS region is orthologous to two discontiguous segments of human chromosome 7, one on 7q11.23 and one on 7q22. The former segment contains the interval commonly deleted in WS and the flanking duplicated blocks; interestingly, the orientation of the central portion of this region is inverted in mouse versus human (Fig. 1). The inverted orientation of the mouse WS region (compared to the human WS region) was confirmed by two-color fluorescent in situ hybridization (FISH) studies with Ncf1- and Fkbp6-containing BACs; the results clearly showed that Ncf1 is at the centromeric end and Fkbp6 at the telomeric end of the WS region on mouse chromosome 5 (data not shown). These physical mapping studies are consistent with the BSS JAX panel genetic mapping data (http://www.jax.org/resources/documents/cmdata/bkmap/BSS.html). Importantly, the breakpoints associated with this evolutionary inversion correspond to the locations of the duplicated blocks in the human WS region, which are also the most common sites of deletion breakpoints seen in WS (Fig. 1). Our finding of an inverted orientation of the mouse versus human WS region is consistent with data generated by others (Peoples et al. 2000; Valero et al. 2000).

Immediately telomeric to the interval commonly deleted in WS is a genomic segment encompassing the HIP1/Hip1, MDH2/Mdh2, POR/Por, and ZP3/Zp3 genes; this region is oriented the same in mouse and human. However, in mouse, this segment is contiguous (in the telomeric direction) with a region that is orthologous to human 7q22 and that contains the Cutl1 and Pai genes. In human, this segment is not contiguous with the WS region and, in fact, is inverted in orientation (relative to the mouse segment; see Fig. 1).

Sequencing of the Mouse WS Region

The 21 overlapping mouse clones depicted in Figure 1 were sequenced by a shotgun sequencing strategy. The GenBank accession number for each resulting sequence is provided in Table 2. Note that the first five clones (391O16, 92N10, P510M19, 303E12, and 42J20) were isolated from libraries derived from the 129SV mouse strain and sequenced prior to the decision to use the C57BL/6J mouse strain (with an emphasis on the RPCI-23 mouse BAC library) for sequencing the mouse genome as part of the Human Genome Project (Battey et al. 1999; Denny and Justice 2000). The remaining 16 clones were isolated from the RPCI-23 library. Taken together, a total of ∼3.3 Mb of nonredundant mouse genomic sequence was generated, of which a single contiguous block of just over 1.4 Mb is finished, high-accuracy sequence (i.e., with an error rate of <1 in 10,000 bp), another ∼1.4 Mb is at a full-shotgun stage (with ∼11-fold average coverage in Phred Q20 bases; Ewing et al. 1998; Ewing and Green 1998) and is currently being finished, and the remaining ∼0.5 Mb is at a working-draft stage (with ∼5-fold average coverage in Phred Q20 bases), as indicated in Figure 1 and Table 2.
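The quality figures quoted above can be related through the standard Phred definition, Q = −10·log10(P), where P is the estimated per-base error probability (Ewing and Green 1998). The following Python sketch is illustrative only: the function names are hypothetical, and computing fold coverage as the number of Q20 bases divided by the clone length is an assumption made for illustration, not a description of the pipeline actually used.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q into an estimated error probability P."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an estimated error probability P into a Phred quality score Q."""
    return -10 * math.log10(p)


def fold_coverage(q20_bases: int, clone_length_bp: int) -> float:
    """Average coverage, ASSUMED here to be (bases with Q >= 20) / clone length."""
    return q20_bases / clone_length_bp


# A single Q20 base call has an estimated error probability of 1 in 100.
print(phred_to_error_prob(20))
# An error rate of 1 in 10,000 corresponds to a quality of Q40.
print(error_prob_to_phred(1e-4))
# Hypothetical clone: 2.2 Mb of Q20 bases over a 200-kb insert gives 11-fold coverage.
print(fold_coverage(2_200_000, 200_000))
```

Under this definition, Q20 describes the reliability of individual base calls in the shotgun reads, whereas the finished-sequence standard of <1 error in 10,000 bp corresponds to a consensus quality of at least Q40.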

Table 2.

Sequenced Mouse Clones

Clone name | Clone type | Status | GenBank No.
391O16 | BAC | Finished | AF267747
92N10 | BAC | Finished | AF289666
P510M19 | PAC | Finished | AF289667
303E12 | BAC | Finished | AF289664
42J20 | BAC | Finished | AF289665
RP23-315E02 | BAC | Finished | AC091250
RP23-201C09 | BAC | Finished | AC079938
RP23-38B15 | BAC | Finished | AC084109
RP23-240H06 | BAC | Finished | AC024607
RP23-289J24 | BAC | Finished | AC074359
RP23-333I24 | BAC | Finished | AC024608
RP23-423A22 | BAC | Full Shotgun | AC083858
RP23-67P07 | BAC | Full Shotgun | AC083948
RP23-11P12 | BAC | Full Shotgun | AC084162
RP23-314O01 | BAC | Full Shotgun | AC087420
RP23-284P20 | BAC | Full Shotgun | AC083890
RP23-311J12 | BAC | Full Shotgun | AC080158
RP23-419B02 | BAC | Full Shotgun | AC084402
RP23-271A20 | BAC | Working Draft | AC083889
RP23-372D12 | BAC | Working Draft | AC083857
RP23-299E16 | BAC | Working Draft | AC079872

Mouse–Human Comparative Sequence Analysis

The resulting mouse genomic sequence was subjected to rigorous computational analyses. Emphasis was placed on studying the large (∼1.4 Mb), contiguous block of finished sequence, which included the entire region orthologous to the interval commonly deleted in WS. For comparison to the finished mouse sequence, we were able to identify finished or draft-level human sequence in GenBank for all but ∼200 kb of the corresponding region on human chromosome 7q11.23 (with the notable segments unavailable for comparative analyses being ∼40 kb encompassing the gene represented by AK005040, ∼100 kb at the 5′ end of ELN, and ∼20 kb just 5′ to CLDN3).

The central analytical and organizational tool for our comparative sequence analyses was the program PipMaker (Hardison et al. 1997; Ellsworth et al. 2000; Schwartz et al. 2000). The core function of this program is to perform direct comparisons between large blocks of orthologous sequences. In addition, though, PipMaker provides an effective and convenient mechanism for assimilating and displaying relevant annotations about large segments of genomic sequence, including the location of repetitive elements and CpG islands, the intron–exon organization of genes, and, most importantly, the areas (both coding and noncoding) found to be highly conserved between two orthologous sequences. To enhance the utility of PipMaker, we recently added a feature that incorporates hyperlinks from annotated regions of the resulting percent-identity plot (PIP) to relevant Internet sites. This allows the creation of an informative and dynamic electronic supplement that captures the key elements of each comparative analysis. An illustration of this new PipMaker feature is provided in Figure 2, which shows a small portion of the PIP generated by comparing the sequences of the mouse and human WS regions (note that the entire PDF-formatted PIP is available at http://bio.cse.psu.edu/publications/desilva).

Figure 2.


Representative portion of the percent-identity plot (PIP) comparing mouse and human sequence from the Williams syndrome (WS) region. The finished mouse sequence reported here was compared with the available orthologous human sequence using PipMaker. The complete PIP and details about the various annotations it contains are available at http://bio.cse.psu.edu/publications/desilva. Shown here is a ∼60-kb region containing portions of the Gtf2i/GTF2I and Gtf2ird1/GTF2IRD1 genes and the interval residing between them. Note that only gap-free segments that are ≥50% identical between mouse and human are plotted. The first two exons and last nine exons of Gtf2i/GTF2I and Gtf2ird1/GTF2IRD1, respectively, are represented by vertical rectangles and numbered accordingly; most of these exons are associated with high levels of mouse–human sequence conservation. Note the two conserved noncoding sequences at ∼205 kb and ∼239 kb (both are gap-free segments of >100 bp in length with mouse–human sequence identities of >70% and >90%, respectively, as indicated by the different colored vertical lines at those positions). Also note the various colored horizontal bars drawn above the two genes; in the actual PDF file generated by PipMaker, these bars provide direct links to relevant Internet sites (e.g., appropriate PubMed citation[s] for the gene [pink], the GenBank record containing the predicted amino acid sequence of the protein encoded by the gene [light blue], and the LocusLink entry for the gene [dark blue]). The bookmarks along the left side provide links to compiled information about the various genes and other annotations generated during the comparative analysis of these sequences.

Our comparative analyses revealed a number of interesting general features of the WS region. First, the GC content of the mouse and human WS regions is similar in both overall level (48.8% and 49.2%, respectively) and relative uniformity across the region (ranging from 41.7% to 51.7% in mouse and 40.2% to 55.5% in human when calculated in 50-kb windows). In contrast, the mouse and human WS regions differ substantially in their repeat content, consisting of 35.9% and 54.2% interspersed repetitive elements (mostly SINEs and LINEs), respectively. In addition, there is a notable lack of uniformity of repeat content across the region, ranging from 30.6% to 62.7% in mouse and 27.9% to 84.3% in human (when calculated in 50-kb windows). The difference in the amount of repetitive sequences largely accounts for the slight compression of the mouse WS region compared to its human counterpart. This is clearly evident in the interval encompassing the genes GTF2IRD2/Gtf2ird2, NCF1/Ncf1, and GTF2I/Gtf2i, for which finished sequence is available in both mouse and human; the size of the same genomic segment is ∼124 kb and ∼169 kb in mouse and human, respectively (consisting of 34.3% and 50.0% interspersed repeats, respectively). Finally, PipMaker analysis revealed numerous segments that are highly conserved between the mouse and human WS regions. Most of these correspond to exons within known and newly identified genes (see below); however, many others appear to be conserved noncoding sequences. Specifically, within the ∼1.4 Mb of finished mouse sequence, 55 gap-free alignments of ≥100 bp in length and with ≥70% mouse–human sequence identity were identified that do not overlap any of the identified exons. Two of these are shown in Figure 2, with the complete list available at http://bio.cse.psu.edu/publications/desilva.
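The windowed GC-content calculation described above can be sketched as follows. This is a minimal Python illustration rather than the procedure actually used: the function name is hypothetical, and the choices to use consecutive non-overlapping windows, to keep the final partial window, and to ignore ambiguous bases (such as N) are assumptions.

```python
def windowed_gc(seq: str, window: int = 50_000) -> list[float]:
    """Return the GC percentage of consecutive, non-overlapping windows.

    Assumptions: windows do not overlap, the last (possibly shorter)
    window is kept, and only A/C/G/T are counted in the denominator.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        counted = sum(chunk.count(base) for base in "ACGT")
        if counted == 0:
            continue  # window consists solely of ambiguous bases
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / counted)
    return values


# Toy example with 4-bp windows; real use would pass window=50_000.
gc_values = windowed_gc("GGCCAATTGCNN", window=4)
print(gc_values, min(gc_values), max(gc_values))
```

The minimum and maximum of the returned list correspond to the kind of range quoted in the text (e.g., 41.7% to 51.7% for mouse). The same windowing scheme could be applied to repeat content by counting masked rather than G+C positions.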

PipMaker analysis also revealed that mouse–human sequence conservation across the WS region is relatively low compared to other genomic regions examined to date, both in terms of the total amount of noncoding, nonrepetitive sequence that is at least moderately conserved (i.e., can be reliably aligned between mouse and human) and the amount that is highly conserved. To quantify this, we focused attention on the finished sequence from the mouse WS region. Following removal of segments for which the orthologous human sequence was not available and the masking of both repeats and annotated coding regions, the remaining mouse sequence was aligned with its human counterpart. Only 20.3% of the nonexonic, nonrepetitive sequence could be aligned between mouse and human, providing a benchmark for the overall level of conservation (Table 3). Only 1.1% of the sequence was found to be highly conserved (i.e., resided within a gap-free alignment of ≥100 bp in length and ≥70% mouse–human sequence identity). For comparison, we performed the same analysis on 12 other genomic regions for which large blocks of finished sequence were available for both mouse and human. For these other regions, we first masked repeats and annotated exons in the human (rather than mouse) sequence. In all but two cases, there is a greater degree of total mouse–human sequence conservation than that encountered with the WS region (Table 3), with a greater percentage of highly conserved sequence seen in all but three cases. In addition, the data presented in Table 3 suggest a potential correlation between mouse–human sequence divergence and the content of G+C nucleotides and/or interspersed repetitive elements; note that the latter is consistent with the findings of Chiaromonte et al. (2001). However, a more systematic study is certainly required before firm conclusions can be reached.
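The two conservation measures used above can be illustrated with a short Python sketch. It assumes that the alignment program's output has already been reduced to a list of non-overlapping, gap-free segments on the masked reference sequence, each with a percent identity; the `Segment` type and function name are hypothetical and are not part of PipMaker. Note that the sketch simplifies the "total conserved" measure by summing gap-free segments only.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start: int       # 0-based start on the reference (inclusive)
    end: int         # end on the reference (exclusive)
    identity: float  # percent identity of the gap-free segment


def conservation_summary(segments, unmasked_length,
                         min_len=100, min_identity=70.0):
    """Return (total conserved %, highly conserved %) of the unmasked sequence.

    Assumes the segments do not overlap and lie entirely within the
    non-exonic, non-repetitive (unmasked) portion of the reference.
    """
    aligned = sum(s.end - s.start for s in segments)
    high = sum(
        s.end - s.start
        for s in segments
        if (s.end - s.start) >= min_len and s.identity >= min_identity
    )
    return 100.0 * aligned / unmasked_length, 100.0 * high / unmasked_length


# Toy data: only the first segment is both >=100 bp and >=70% identical.
toy = [Segment(0, 150, 80.0), Segment(200, 260, 95.0), Segment(300, 420, 65.0)]
print(conservation_summary(toy, unmasked_length=1000))
```

In this toy example, the second segment is excluded from the highly conserved fraction because it is too short, and the third because its identity is too low, even though both still count toward the total aligned fraction.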

Table 3.

Mouse-Human Sequence Conservation in Selected Genomic Regions

Columns (b) through (e) refer to the non-exonic, non-repetitive (unmasked) sequence.

Genomic region (a) | Total conserved (%) (b) | Highly conserved (%) (c) | G+C (%) (d) | Length (bp) (e) | Masked (%) (f) | Reference (g)
HOXA | 99.3 | 21.3 | 50.7 | 93,211 | 15.2 | Unpublished
TCR | 77.8 | 7.0 | 44.0 | 77,115 | 21.0 | Koop and Hood 1994
FHIT | 58.1 | 7.6 | 37.1 | 331,123 | 42.1 | Shiraishi et al. 2001
CFTR | 53.2 | 4.9 | 34.9 | 247,331 | 41.3 | Ellsworth et al. 2000
BTK | 49.6 | 4.9 | 41.1 | 43,504 | 41.0 | Oeltjen et al. 1997
SNCA | 44.4 | 1.0 | 34.6 | 84,504 | 29.8 | Touchman et al. 2001
DIST1 | 40.9 | 0.8 | 55.3 | 64,841 | 45.7 | Flint et al. 2001
MECP2 | 39.7 | 5.9 | 47.8 | 59,670 | 56.9 | Reichwald et al. 2000
CD4 | 35.6 | 3.3 | 51.9 | 106,531 | 50.8 | Ansari-Lari et al. 1998
CECR | 21.3 | 1.8 | 45.9 | 368,778 | 52.5 | Footz et al. 2001
WS region | 20.3 | 1.1 | 48.9 | 573,537 | 49.7 | This paper
MYO15 | 15.4 | 3.7 | 56.9 | 46,035 | 47.7 | Liang et al. 1999
ERCC2 | 11.0 | 0 | 58.5 | 15,721 | 61.7 | Lamerdin et al. 1996
(a) Listed here are 13 genomic regions for which mouse and human genomic sequence is available for comparative analyses. In all cases except for the WS region, finished sequence was available for both mouse and human; in these cases, the name of a known (human) gene within the sequenced region is given. In the case of the WS region, the ∼1.4 Mb of finished mouse sequence was analyzed and an attempt was made to remove mouse sequence for which the orthologous human sequence was not available.

(b) Annotated exons and sequences identified by the RepeatMasker program (using the default settings) were masked in the human sequence (or the mouse sequence in the case of the WS region). The mouse and human sequences were then aligned with the BLASTZ component of PipMaker (using the default settings). In all cases except for the WS region, the human sequence was used as the reference for the PipMaker analysis. Shown in this column is the percentage of the non-exonic, non-repetitive sequence within a mouse–human alignment, reflecting the amount of unmasked sequence with at least moderate levels of mouse–human sequence conservation.

(c) Percentage of the non-exonic, non-repetitive (unmasked) sequence within a gap-free mouse–human sequence alignment of ≥100 bp in length and ≥70% nucleotide identity.

(d) Percentage of G+C nucleotides in the non-exonic, non-repetitive (unmasked) sequence.

(e) Total length (in bp) of the non-exonic, non-repetitive (unmasked) sequence.

(f) Percentage of the entire region masked as repetitive or exonic.

(g) All of the mouse and human genomic sequences used for the analysis summarized in this table are in GenBank. When available, a citation reporting the mouse and/or human sequence for the region is provided.

Significant effort was also focused on the computational detection and annotation of genes residing in the WS region. The availability of both mouse and human genomic sequences greatly enhanced the ability to detect genes and to define their long-range organization. Table 4 provides a summary of the 30 genes identified within the ∼1.4 Mb of finished mouse sequence, with additional details (e.g., deduced coding sequences, predicted amino acid sequences of the corresponding proteins, and presence of conserved domains) available at http://bio.cse.psu.edu/publications/desilva. Of these 30 genes, 20 have been assigned names and reported previously as residing within the WS region (see Table 1), while one (Gtf2ird2) is associated with an annotated GenBank record (AY014963) indicating its presence in the WS region. Importantly, the remaining 9 (in each case indicated in Table 4 by a representative GenBank record containing a corresponding full-length cDNA sequence or an associated expressed-sequence tag [EST]) represent newly identified genes with respect to their presence in the WS region. The evidence that these are authentic genes includes the identification of cDNA sequences matching the mouse genomic sequence, their overlap with GenScan-predicted gene models (in all but one case), and the presence of strong mouse–human sequence conservation; these features are detailed in Figure 3. Remarkably, 6 of these newly identified genes (AK017044, AK004244, AK008014, AK003386, AK019256, and BE290321) clearly reside within the genomic interval commonly deleted in WS. Additional features of the newly identified genes are summarized in an electronic table at http://bio.cse.psu.edu/publications/desilva.

Table 4.

Genes Identified in the ∼1.4 Mb of Finished Sequence from the Mouse WS Region

The last three columns are mouse–human comparisons.

Gene (a) | CpG island (b) | CDS length in bp, mouse (human) (c) | CDS, % identity (d) | AA sequence, % identity (e)
AK005040 | Yes | 1163 (NA) | NA | NA
Gtf2ird2 | Yes | 2811 (2673) | 79.3 | 82.1
Ncf1 | No | 1173 (1170) | 81.4 | 82.5
Gtf2i | Yes | 2940 (2937) | 87.7 | 96.8
Gtf2ird1 | Yes | 2071 (2077) | 88.0 | 87.1
Cyln2 | Yes | 3136 (3134) | 86.0 | 91.4
Rfc2 | Yes | 1050 (1066) | 84.9 | 92.8
Wbscr15 | No | 576 (610) | 74.0 | 64.6
Eif4h | Yes | 747 (747) | 91.3 | 98.4
Limk1 | Yes | 1944 (1944) | 88.0 | 95.2
Eln | Yes | 2582 (2274) | 81.4 | 81.8
Cldn4 | Yes | 631 (628) | 82.7 | 83.2
Cldn3 | Yes | 660 (663) | 88.2 | 91.3
AK017044 | No | 838 (NA) | NA | NA
AK004244 | Yes | 924 (NA) | NA | NA
AK008014 | No | 544 (529) | 75.3 | NA
Stx1a | Yes | 867 (863) | 91.0 | 98.3
AK003386 | Yes | 1135 (1045) | 81.2 | 74.9
AK019256 | Yes | 530 (530) | 78.8 | 76.3
BE290321 | Yes | 521 (NA) | NA | NA
Wbscr14 | Yes | 2595 (2559) | 83.9 | 81.6
Tbl2 | Yes | 1329 (1344) | 85.4 | 87.8
Bcl7b | Yes | 546 (546) | 88.5 | 94.6
Baz1b | Yes | 4440 (4452) | 86.6 | 91.1
Fzd9 | Yes | 1648 (1648) | 87.6 | 95.8
Fkbp6 | Yes | 864 (864) | 81.6 | 86.0
BF522554 | Yes | 1455 (1466) | 84.2 | 78.8
BE630793 | Yes | 1211 (1212) | 83.2 | NA
Pom121 | Yes | 3361 (3440) | 78.1 | 71.1
Hip1 | Yes | 2518 (2518) | 87.6 | 87.6
(a) The 30 genes identified within the ∼1.4 Mb of finished sequence from the mouse WS region are listed in their order on mouse chromosome 5G1-G2 (from centromere to telomere; see Fig. 1). Of these 30 genes, 21 have been previously published (listed in Table 1 and depicted in Fig. 1) or, in the case of Gtf2ird2, submitted as an annotated GenBank record (AY014963). In the case of the 9 genes previously not reported as residing in the WS region, representative GenBank accession numbers are provided (see Fig. 3).

(b) The presence (yes) or absence (no) of an overlap between the 5′ exon of the gene and a CpG island (regions of ≥50% G+C content where the ratio of CpG dinucleotides relative to GpC is ≥60% within a 200-bp window) is indicated. In two cases (BE290321 and BF522554), cDNA sequence was not available to define the 5′ exons; instead, the 5′ exons were predicted by GenScan based on extending an existing EST (to a methionine codon).

(c) The length of each mouse coding sequence (CDS) was established by one of several methods. If a mouse RefSeq entry was available for the gene (http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html), the length of the CDS in that record was used. In the absence of a mouse RefSeq record but presence of a human gene sequence (HIP1), a BLASTZ alignment was used to identify the putative mouse coding and predicted amino acid sequences. In the absence of a human gene, other sources were used to annotate the mouse genes. For example, the rat Pom121 gene aligned with the mouse genomic sequence at >85% identity with precise exon boundaries and was therefore used to annotate the mouse Pom121 exons. Two genes (BF522554 and BE630793) were identified by a MegaBLAST search of the mouse genomic sequence against the TIGR EST database (http://www.tigr.org/tdb/tgi.shtml); the resulting information was used in conjunction with GenScan to establish the mouse gene model. The length of each human coding sequence was estimated by PipMaker (this was done for consistency because there was no corresponding human RefSeq record nor human LocusLink mRNA entry for roughly a third of the mouse genes). Of note, analyses performed using available human RefSeq records yielded the same results as those obtained using the PipMaker-predicted human coding sequences. In one case (ELN), PipMaker failed to predict a human coding sequence, and the available RefSeq record was used instead. In another case (GTF2IRD2), PipMaker failed to predict a coding sequence and no full-length human cDNA sequence was available in GenBank, so a GenScan prediction of the human coding sequence was used. In four cases (indicated by NA), none of the above means for predicting the human coding sequence was effective, most often due to the lack of available human genomic or cDNA sequence.

(d) The tool EMBOSS (http://www.ebi.ac.uk/emboss/align), which uses the Needleman-Wunsch global alignment algorithm to find the optimum alignment (including gaps) of two sequences when considering their entire length, was used to calculate the percent-identity of the mouse and human coding sequences over the aligned regions. In four cases, no human coding sequence was available for this analysis (indicated by NA).

(e) The predicted amino acid (AA) sequence derived from each orthologous mouse–human gene pair was compared using EMBOSS. The indicated percent-identity corresponds to the percentage of the total amino acids with identical matches between the two sequences over the aligned regions. When available, the amino acid sequences were derived from RefSeq records; otherwise, matching GenBank protein records were used. In the case of BF522554, neither of these sources was available; thus, a translated version of the coding sequence predicted by PipMaker was used. When PipMaker failed to predict a human coding sequence for a mouse gene or no open reading frame could be found in the predicted coding sequence, BLASTX or BLASTP was used to search the National Center for Biotechnology Information database. For three genes (AK003386, AK019256, and Pom121), this yielded an aligning human protein (XP_042880, XP_042882, and XP_034753.1, respectively). In some cases (indicated by NA), amino acid sequence alignments could not be generated, either because the mouse coding sequence did not provide an open reading frame that enabled an accurate prediction of a protein sequence or because a human amino acid sequence could not be obtained for alignment with the predicted mouse protein.
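To make the percent-identity values in Table 4 more concrete, the following Python sketch implements a basic Needleman-Wunsch global alignment and reports identity over the aligned columns. It is a simplified illustration and not the EMBOSS implementation: the scoring values (match +1, mismatch −1, linear gap −1) are assumptions chosen for brevity, the function names are hypothetical, and counting identity over all alignment columns (including gapped ones) is one possible reading of "over the aligned regions".

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a: str, aln_b: str) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    if not aln_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)


aligned = needleman_wunsch("ACGT", "ACT")
print(aligned, percent_identity(*aligned))
```

Because a global alignment spans both sequences end to end, length differences between the mouse and human coding sequences (e.g., Eln at 2582 vs. 2274 bp) introduce gapped columns that lower the reported identity under this convention.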

Figure 3.


Identification of previously unreported genes in the Williams syndrome (WS) region. Of the 30 genes identified within the ∼1.4 Mb of finished mouse sequence (see Table 4), 9 have not been previously reported to reside within the WS region. Information about each of these 9 genes is provided (listed in order across the mouse WS region), including (1) a representative GenBank accession number for the mouse cDNA sequence (note in one case, BF522554, the only available cDNA sequence was from rat); (2) the type of sequence contained in that GenBank record (Riken full-length [FL] cDNA sequence [Kawai et al. 2001] or EST); (3) the percent-identity between the mouse genomic sequence and the matching cDNA sequence; (4) an indication of whether or not the putative gene overlaps a GenScan-predicted gene (specifically, if >1 exon matches a Genscan-predicted exon or, in the case of AK019256, the single exon matches the predicted exon for >500 bp; note that the only gene not meeting these criteria, AK017044, did have one of its exons matching a Genscan-predicted exon); and (5) the gene-containing portion of the percent-identity plot (PIP) showing the pattern of mouse–human sequence conservation (except for AK005040 and AK017044, for which no human sequence was available). See Fig. 2 for additional details about the PIP.

The 30 identified genes are associated with a number of other interesting features. First, all but 4 (87%) have a CpG island at their 5′ end (Table 4); this is a considerably higher fraction than that reported previously for mouse genes (Antequera and Bird 1993; Jareborg et al. 1999). Second, the splice sites and intron–exon organization of the genes are the same in mouse and human (at least for the genes for which genomic sequence was available in both species) except for Eln/ELN, which has 81% amino acid identity between mouse and human but shows a lack of conservation at the splice junctions. Third, the coding-sequence conservation between the mouse–human orthologous gene pairs (Table 4) falls within the typical range established previously (Makalowski et al. 1996; Makalowski and Boguski 1998), with the exceptions being the less conserved Wbscr15/WBSCR15 (as we reported previously [Doyle et al. 2000]) and perhaps Pom121/POM121. Finally, with the exception of the changes associated with the evolutionary inversions depicted in Figure 1, gene order is the same in the mouse and human WS regions.

The ∼1.9-Mb segment of draft-level mouse sequence that we generated (corresponding to the seven clones taken to full-shotgun and three clones taken to working-draft levels of redundancy; see Table 2) is orthologous to a region of human chromosome 7 that is telomeric to the interval commonly deleted in WS (Fig. 1). As such, less rigorous computational analyses have thus far been performed with this mouse sequence. However, since human sequence is available for virtually all of this segment, a routine set of comparative analyses was performed using PipMaker, with the resulting PIPs available at http://bio.cse.psu.edu/publications/desilva.

DISCUSSION

It is now well-established that the comparative analysis of genomic sequence from different organisms represents a powerful means for identifying conserved coding and noncoding regions, including regulatory elements (Duret and Bucher 1997; Hardison et al. 1997; Hardison 2000; Miller 2000; Wasserman et al. 2000; Cliften et al. 2001; Pennacchio and Rubin 2001; Touchman et al. 2001). With the recent completion of a working-draft sequence of the human genome (International Human Genome Sequencing Consortium 2001; Venter et al. 2001), increasing attention is being given to the sequencing of other organisms (Green 2001). In particular, the sequencing of the mouse genome is now taking center stage (Battey et al. 1999; Denny and Justice 2000), with the recognition that the resulting data will provide both an invaluable infrastructure for performing research with this important experimental animal and the ability to more rigorously annotate the human sequence by comparative analyses (Batzoglou et al. 2000; Bouck et al. 2000).

Indeed, the past few years have brought a sizable crescendo in the generation of mouse genomic sequence, allowing insightful comparisons to be made with the orthologous human sequence. Notable examples of large (e.g., >300 kb) blocks of generated mouse sequence include that from the velocardiofacial syndrome region (∼634 kb; Lund et al. 2000), the Cftr region (∼358 kb; Ellsworth et al. 2000), the Bpa/Str region (∼430 kb; Mallon et al. 2000), the region on chromosome 7 containing an imprinted genomic domain (∼1 Mb; Onyango et al. 2000), the region on chromosome 11 containing a cluster of interleukin genes (∼1100 kb; Loots et al. 2000), the region containing the protocadherin gene cluster (∼900 kb; Wu et al. 2001), the cat eye syndrome region (∼450 kb; Footz et al. 2001), the region on chromosome 17 containing a cluster of olfactory receptor genes (∼330 kb; Younger et al. 2001), a segment on mouse chromosome 16 orthologous to the Down's syndrome critical region (∼470 kb; Pletcher et al. 2001), the Fra14A2/Fhit region (∼600 kb; Shiraishi et al. 2001), and the 15 mouse genomic segments orthologous to human chromosome 19 (totaling ∼42 Mb; Dehal et al. 2001); note that a handful of other examples are also cataloged at www.ncbi.nlm.nih.gov/genome/seq/MmProgress.shtml. Together, the generated mouse sequence has played a key role in the establishment and refinement of computational approaches for systematic comparative sequence analysis (Mallon and Strivens 1998; Stojanovic et al. 1999; Batzoglou et al. 2000), with the emergence of tools such as PipMaker (http://bio.cse.psu.edu; Schwartz et al. 2000), VISTA (http://sichuan.lbl.gov/vista; Mayor et al. 2000), and Alfresco (http://www.sanger.ac.uk/Software/Alfresco; Jareborg and Durbin 2000).

The ∼3.3 Mb of sequence reported here for the mouse WS region represents one of the largest and most complete blocks of mouse sequence reported to date. This is particularly the case with respect to the ∼1.4-Mb contiguous segment of finished, high-accuracy sequence. Indeed, in many of the cases listed above, only draft-level mouse sequence has thus far been generated. Our extensive and high-quality data set provided the opportunity to perform detailed computational analyses, with particular emphasis on mouse–human sequence comparisons. Several general findings deserve special mention. First, the order and structure of genes in the mouse and human WS regions are well conserved, with the only exceptions relating to the two large evolutionary inversions illustrated in Figure 1. Second, comparative sequence analysis in conjunction with cDNA/EST comparisons and Genscan predictions has provided strong evidence for the presence of at least nine previously unreported genes within the WS region (see Fig. 3 and below). Finally, numerous conserved noncoding sequences can be readily identified within the human and mouse WS regions; these represent viable candidates for regulatory elements associated with the numerous genes residing in the region or perhaps serve some other biologically important function(s). Of note, during the generation of our mouse sequence data, Martindale et al. (2000) reported the elucidation and analysis of ∼115 kb of sequence from the mouse WS region, specifically a segment encompassing the genes Limk1, Eif4h, Wbscr15, and Rfc2. Their analyses of this portion of the mouse WS region are concordant with the results presented here.

Our experience in analyzing the sequence of the mouse WS region once again illustrates the tremendous value of mouse–human sequence comparisons for annotating genes. Simple comparisons of genomic sequences and collections of cDNA-derived (e.g., EST) sequences often fail to detect certain mRNAs (e.g., those expressed at low levels or in a tissue-restricted fashion). In addition, false-positive results are common, typically due to contaminating genomic sequences amongst the ESTs. However, a combined strategy employing both mouse–human genomic sequence comparisons and genomic-cDNA sequence comparisons provides an efficient and effective path toward the construction of accurate gene models. For example, such a combined approach led to our identification of a previously undetected 5′ terminal exon of HIP1/Hip1, leading to refined information about the structure of this gene beyond that available in RefSeq. In addition, evidence of mouse–human sequence conservation provided critical clues that directly led to the identification of the nine previously unreported genes in the WS region. Once detected, the conserved regions were more carefully compared to available sequence databases, resulting in the identification of matching full-length cDNA sequences in a majority of cases.

PipMaker is now a well-established program for performing the types of routine comparative sequence analyses mentioned above. The new enhancements to PipMaker reported here should further increase the utility of this tool. In particular, PipMaker can now be used to capture and disseminate the large amount of ancillary information that is routinely generated during the comparative analysis of large blocks of genomic sequence, in essence providing an archive of both the underlying data and a detailed account of any analyses performed with it. This is accomplished through the creation of a PDF-based file that contains both the PIP and links from relevant features of the PIP to specific Internet sites. Such a PDF file can serve as an electronic supplement to a publication, which inevitably can only provide highlights of the comparative analyses being reported (e.g., Figs. 2, 3). Indeed, this is just one facet of the expanding synergy between traditional scientific publishing and the Internet. An alternate approach to this problem was recently described (Wilson et al. 2001), which involves the use of a sequence-alignment viewer that is provided as part of the electronic supplement and downloaded automatically by the Web browser when viewing alignments. An advantage of the Wilson et al. strategy is that it provides greater interactivity to the end-user, for example, allowing access to alignments with nucleotide-level resolution. An advantage of PipMaker is that it only utilizes features of the PDF language, making the supplemental archive much easier to create and to access.

The region of human chromosome 7q11.23 commonly deleted in WS is of great medical and biological interest because of the relative frequency of the disease (∼1:20,000), the complex and intriguing phenotypic features of WS (Burn 1986; Morris et al. 1988; Bellugi et al. 1990, 1999; Lashkari et al. 1999; Mervis et al. 1999; Donnai and Karmiloff-Smith 2000; Mervis and Klein-Tasman 2000; Morris and Mervis 2000), and the involvement of large, duplicated blocks of DNA in the deletional events leading to the syndrome (Perez Jurado et al. 1996; Robinson et al. 1996; Baumer et al. 1998). The mouse sequencing efforts reported here should accelerate research aiming to better understand the genetic basis of WS. First, our data provide a comprehensive resource for characterizing the genes residing within and around the interval commonly deleted in WS. This includes information about gene structure as well as valuable clues about potential regulatory regions. The value of this mouse sequence deserves highlighting in light of the difficult-to-generate and, at present, fragmentary nature of the human sequence for the WS region. Second, our comparative analyses have revealed the presence of at least nine genes that were not previously known to reside within the WS region. Importantly, six of these genes are located within the interval commonly deleted in WS, making each an important candidate to evaluate for its possible role in the disorder. Finally, the mouse sequence we generated should aid the creation of mouse models of WS. Specifically, significant efforts are currently ongoing to create mouse strains completely deleted or hemizygous for one or more genes within the WS region. Our efforts have provided a key infrastructure (i.e., complete genomic sequence) that should greatly facilitate the design of appropriate knockout constructs as well as a set of additional gene targets. In light of the difficulty to date in assigning specific genes to WS-associated phenotypic features, the ability to generate mouse models is regarded as key for untangling the complex genetics of WS.

In a slightly different context, our studies provide insight about the evolution of the WS region and the genes residing therein. Based on our comparative mapping and sequence data, this region has undergone extensive evolutionary changes in the human and/or mouse lineages since their last common ancestor. For example, the genomic complexities (with respect to large, closely spaced duplicated segments) encountered in the human and other great apes are not present in more distantly related mammals, such as the mouse (DeSilva et al. 1999). Interestingly, these duplicated segments reside at the breakpoints associated with an evolutionary inversion, such that the interval commonly deleted in WS has an inverted orientation in the human versus the mouse genome. In addition, there is a second evolutionary inversion associated with a genomic segment residing just telomeric to the WS region; this segment is contiguous with the rest of the WS region in mouse but discontiguous in human. It is interesting to contemplate the steps that produced two evolutionary inversions and one breakpoint within the human and mouse lineages, as discussed by Valero et al. (2000). At a sequence level, there is also evidence for significant divergence between the mouse and human WS regions. Indeed, the overall level of mouse–human sequence conservation across the WS region is atypically low; this is particularly the case for the noncoding (and nonrepetitive) sequence (Table 3), but is also evident for some genes (e.g., Wbscr15/WBSCR15 [Doyle et al. 2000; Martindale et al. 2000] and Pom121/POM121; see Table 4).

In summary, our studies show how comparative sequence analysis can simultaneously provide valuable data for addressing problems in both human genetics and genome evolution. Based on this experience and the anticipated surge in the acquisition of genomic sequence for numerous other organisms, one can now readily envision a new era of scientific inquiry, in which sequence-based comparisons drive the study of genome structure, function, and evolution.

METHODS

Mouse Genomic Sequencing

The overlapping set of mouse BAC (Shizuya et al. 1992) and PAC (Ioannou et al. 1994) clones shown in Figure 1 and listed in Table 2 were selected from either the contig reported previously (DeSilva et al. 1999; specifically, clones 391O16, 92N10, 303E12, and 42J20 isolated from the Research Genetics CITB-CJ7-B [strain 129SV] mouse BAC library [http://www.resgen.com] and clone P510M19 isolated from the RPCI-21 [strain 129SV] mouse PAC library [http://www.chori.org/bacpac]) or one more recently constructed as part of a larger mouse mapping effort (Thomas et al. 2000; specifically, clones with the prefix ‘RP23’ that were isolated from the RPCI-23 [strain C57BL/6J] mouse BAC library [http://www.chori.org/bacpac; Osoegawa et al. 2000]). Colony-pure clone isolates were subjected to restriction enzyme digest-based fingerprint analysis (Marra et al. 1997), and the resulting data were analyzed with the programs Image and FPC (http://www.sanger.ac.uk/Software; Soderlund et al. 1997, 2000) to assemble BAC/PAC contig maps, which in turn were used to guide the selection of overlapping clones for sequencing. Each selected clone was subjected to shotgun sequencing (Wilson and Mardis 1997; Green 2001), essentially as described previously (DeSilva et al. 2000; Ellsworth et al. 2000; Touchman et al. 2000). Sequences were edited and assembled with the Phred/Phrap/Consed suite of programs (Ewing et al. 1998; Ewing and Green 1998; Gordon et al. 1998).

Comparative Analyses of Mouse and Human Sequences

The generated mouse sequence reported here was subjected to detailed computational analyses, including comparisons with the orthologous human sequence (when available). Genomic sequence from the human WS region was obtained as follows. The available sequence encompassing the LIMK1-RFC2 interval (Martindale et al. 2000) was supplemented with individual sequence records found by searching the NCBI databases (nr and htgs); most often, these records contained draft-level (as opposed to finished) sequence. In some cases, only small sequence contigs were available. For example, the CLDN3 gene could only be found on a ∼1.6-kb stretch of sequence, with the regions immediately flanking the gene not available for comparison with the mouse sequence.

Mouse and human genomic sequences were compared by constructing a percent-identity plot (Hardison et al. 1997; Ellsworth et al. 2000; Schwartz et al. 2000). Specifically, the generated mouse sequence and available human sequence were subjected to repeat masking with the RepeatMasker program (A.F.A. Smit and P. Green, unpubl. data; see http://www.genome.washington.edu/UWGC/analysistools/repeatmask.htm). The human sequence was then aligned relative to the mouse sequence using the BLASTZ component of the PipMaker program (http://bio.cse.psu.edu; Schwartz et al. 2000). In the resulting PIP, segments that were ≥50% identical between mouse and human were plotted, with other regions appearing blank. Gaps within an alignment appear as discontinuities between adjacent horizontal lines. Representative portions of the PIP generated with the sequences from the mouse and human WS regions are shown in Figures 2 and 3, with a more complete summary of the PipMaker results available at http://bio.cse.psu.edu/publications/desilva. Additional information about the range of computational analyses performed is also detailed in Tables 3 and 4.
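
The plotting rule described above (each gap-free stretch of an alignment is drawn at its percent identity, and only stretches that are ≥50% identical are shown) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not PipMaker's code: the function name `pip_segments`, its parameters, and the 1-based inclusive coordinates are hypothetical, and the input is assumed to be a single pairwise alignment given as two gapped strings with the mouse sequence as the reference.

```python
def pip_segments(aln_ref, aln_other, ref_start=1, min_identity=50.0):
    """Split a gapped pairwise alignment into gap-free segments.

    Returns a list of (ref_begin, ref_end, percent_identity) tuples in
    reference coordinates (1-based, inclusive; an assumed convention),
    keeping only segments with identity >= min_identity.
    """
    if len(aln_ref) != len(aln_other):
        raise ValueError("aligned strings must have equal length")

    segments = []
    pos = ref_start                      # coordinate of the next reference base
    begin, matches, length = None, 0, 0

    # A trailing gap column acts as a sentinel so the last segment is flushed.
    for r, o in zip(aln_ref + "-", aln_other + "-"):
        if r == "-" or o == "-":
            if length:
                identity = 100.0 * matches / length
                if identity >= min_identity:
                    segments.append((begin, begin + length - 1, identity))
                begin, matches, length = None, 0, 0
            if r != "-":
                pos += 1                 # reference base opposite a gap still advances
            continue
        if length == 0:
            begin = pos
        length += 1
        matches += r.upper() == o.upper()
        pos += 1
    return segments


if __name__ == "__main__":
    # Two gap-free segments; only the first passes the 50% threshold.
    for seg in pip_segments("ACGT--GGGG", "ACGAAATTTG"):
        print(seg)
```

Each returned tuple corresponds to one horizontal line in a plot of this kind: its extent along the reference axis and its height on the percent-identity axis. Segments below the threshold are dropped, which is why poorly conserved regions appear blank.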

Acknowledgments

We thank the staff of the NIH Intramural Sequencing Center (NISC) for their dedicated work in generating the mouse sequence reported here, with special thanks to Michelle Walker, Jyoti Gupta, Sirintorn Stantripop, and Quino Maduro for their efforts in sequence finishing. We also thank the Washington University Genome Sequencing Center for generating the human sequence; Amalia Dutra for FISH studies; Jennifer Munsterteiger for editorial assistance; and Elliott Margulies, Matthew Portnoy, and Arjun Prasad for critical review of the manuscript. This work was supported in part by grant HG02238 (W.M.), grant HG02325-01 (L.E.), and funds for mouse sequencing (E.D.G.) from the National Human Genome Research Institute (NIH).

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

E-MAIL egreen@nhgri.nih.gov; FAX 301-402-4735.

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.214802.

REFERENCES

  1. Ansari-Lari MA, Oeltjen JC, Schwartz S, Zhang Z, Muzny DM, Lu J, Gorrell JH, Chinault AC, Belmont JW, Miller W, et al. Comparative sequence analysis of a gene-rich cluster at human chromosome 12p13 and its syntenic region in mouse chromosome 6. Genome Res. 1998;8:29–40. [PubMed] [Google Scholar]
  2. Antequera F, Bird A. Number of CpG islands and genes in human and mouse. Proc Natl Acad Sci. 1993;90:11995–11999. doi: 10.1073/pnas.90.24.11995. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Battey J, Jordan E, Cox D, Dove W. An action plan for mouse genomics. Nat Genet. 1999;21:73–75. doi: 10.1038/5012. [DOI] [PubMed] [Google Scholar]
  4. Batzoglou S, Pachter L, Mesirov JP, Berger B, Lander ES. Human and mouse gene structure: Comparative analysis and application to exon prediction. Genome Res. 2000;10:950–958. doi: 10.1101/gr.10.7.950. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Baumer A, Dutly F, Balmer D, Riegel M, Tukel T, Krajewska-Walasek M, Schinzel AA. High level of unequal meiotic crossovers at the origin of the 22q11.2 and 7q11.23 deletions. Hum Mol Genet. 1998;7:887–894. doi: 10.1093/hmg/7.5.887. [DOI] [PubMed] [Google Scholar]
  6. Bayarsaihan D, Ruddle FH. Isolation and characterization of BEN, a member of the TFII-I family of DNA-binding proteins containing distinct helix-loop-helix domains. Proc Natl Acad Sci. 2000;97:7342–7347. doi: 10.1073/pnas.97.13.7342. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Bellugi U, Bihrle A, Jernigan T, Trauner D, Doherty S. Neuropsychological, neurological, and neuroanatomical profile of Williams syndrome. Am J Med Genet. 1990;6:115–125. doi: 10.1002/ajmg.1320370621. [DOI] [PubMed] [Google Scholar]
  8. Bellugi U, Lichtenberger L, Mills D, Galaburda A, Korenberg JR. Bridging cognition, the brain and molecular genetics: Evidence from Williams syndrome. Trends Neurosci. 1999;22:197–207. doi: 10.1016/s0166-2236(99)01397-1. [DOI] [PubMed] [Google Scholar]
  9. Bouck JB, Metzker ML, Gibbs RA. Shotgun sample sequence comparisons between mouse and human genomes. Nat Genet. 2000;25:31–33. doi: 10.1038/75563. [DOI] [PubMed] [Google Scholar]
  10. Burn J. Williams syndrome. J Med Genet. 1986;23:389–395. doi: 10.1136/jmg.23.5.389. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Chiaromonte F, Yang S, Elnitski L, Yap VB, Miller W, Hardison RC. Association between divergence and interspersed repeats in mammalian noncoding genomic DNA. Proc Natl Acad Sci. 2001; in press. [DOI] [PMC free article] [PubMed]
  12. Cliften PF, Hillier LW, Fulton L, Graves T, Miner T, Gish WR, Waterston RH, Johnston M. Surveying Saccharomyces genomes to identify functional elements by comparative DNA sequence analysis. Genome Res. 2001;11:1175–1186. doi: 10.1101/gr.182901. [DOI] [PubMed] [Google Scholar]
  13. Dehal P, Predki P, Olsen AS, Kobayashi A, Folta P, Lucas S, Land M, Terry A, Ecale Zhou CL, Rash S, et al. Human chromosome 19 and related regions in mouse: Conservative and lineage-specific evolution. Science. 2001;293:104–111. doi: 10.1126/science.1060310. [DOI] [PubMed] [Google Scholar]
  14. de Luis O, Valero MC, Perez Jurado LA. WBSCR14, a putative transcription factor gene deleted in Williams-Beuren syndrome: Complete characterisation of the human gene and the mouse ortholog. Eur J Hum Genet. 2000;8:215–222. doi: 10.1038/sj.ejhg.5200435. [DOI] [PubMed] [Google Scholar]
  15. Denny P, Justice MJ. Mouse as the measure of man? Trends Genet. 2000;16:283–287. doi: 10.1016/s0168-9525(00)02039-4. [DOI] [PubMed] [Google Scholar]
  16. DeSilva U, Massa H, Trask BJ, Green ED. Comparative mapping of the region of human chromosome 7 deleted in Williams syndrome. Genome Res. 1999;9:428–436. [PMC free article] [PubMed] [Google Scholar]
  17. DeSilva U, Miller E, Gorlach A, Foster CB, Green ED, Chanock SJ. Molecular characterization of the mouse p47-phox (Ncf1) gene and comparative analysis of the mouse p47-phox (Ncf1) gene to the human NCF1 gene. Mol Cell Biol Res Commun. 2000;3:224–230. doi: 10.1006/mcbr.2000.0214. [DOI] [PubMed] [Google Scholar]
  18. Donnai D, Karmiloff-Smith A. Williams syndrome: From genotype through to the cognitive phenotype. Am J Med Genet. 2000;97:164–171. doi: 10.1002/1096-8628(200022)97:2<164::aid-ajmg8>3.0.co;2-f. [DOI] [PubMed] [Google Scholar]
  19. Doyle JL, DeSilva U, Miller W, Green ED. Divergent human and mouse orthologs of a novel gene (WBSCR15/Wbscr15) reside within the genomic interval commonly deleted in Williams syndrome. Cytogenet Cell Genet. 2000;90:285–290. doi: 10.1159/000056790. [DOI] [PubMed] [Google Scholar]
  20. Duret L, Bucher P. Searching for regulatory elements in human noncoding sequences. Curr Opin Struct Biol. 1997;7:399–406. doi: 10.1016/s0959-440x(97)80058-9. [DOI] [PubMed] [Google Scholar]
  21. Ellsworth RE, Jamison DC, Touchman JW, Chissoe SL, Braden Maduro VV, Bouffard GG, Dietrich NL, Beckstrom-Sternberg SM, Iyer LM, Weintraub LA, et al. Comparative genomic sequence analysis of the human and mouse cystic fibrosis transmembrane conductance regulator genes. Proc Natl Acad Sci. 2000;97:1172–1177. doi: 10.1073/pnas.97.3.1172. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Ewart AK, Morris CA, Atkinson D, Jin W, Sternes K, Spallone P, Stock AD, Leppert M, Keating MT. Hemizygosity at the elastin locus in a developmental disorder, Williams syndrome. Nat Genet. 1993;5:11–16. doi: 10.1038/ng0993-11. [DOI] [PubMed] [Google Scholar]
  23. Ewing B, Green P. Base-calling of automated sequencer traces using Phred. II. error probabilities. Genome Res. 1998;8:186–194. [PubMed] [Google Scholar]
  24. Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using Phred. I. accuracy assessment. Genome Res. 1998;8:175–185. doi: 10.1101/gr.8.3.175. [DOI] [PubMed] [Google Scholar]
  25. Fazio MJ, Mattei M-G, Passage E, Chu M-L, Black D, Solomon E, Davidson JM, Uitto J. Human elastin gene: New evidence for localization to the long arm of chromosome 7. Am J Hum Genet. 1991;48:696–703. [PMC free article] [PubMed] [Google Scholar]
  26. Flint J, Tufarelli C, Peden J, Clark K, Daniels RJ, Hardison R, Miller W, Philipsen S, Tan-Un KC, McMorrow T, et al. Comparative genome analysis delimits a chromosomal domain and identifies key regulatory elements in the α globin cluster. Hum Mol Genet. 2001;10:371–382. doi: 10.1093/hmg/10.4.371. [DOI] [PubMed] [Google Scholar]
  27. Footz TK, Brinkman-Mills P, Banting GS, Maier SA, Aliriazi M, Riazi MA, Bridgland L, Hu S, Birren B, Minoshima S, et al. Analysis of the cat eye syndrome critical region in humans and the region of conserved synteny in mice: A search for candidate genes at or near the human chromosome 22 pericentromere. Genome Res. 2001;11:1053–1070. doi: 10.1101/gr.154901. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Francke U. Williams-Beuren syndrome: Genes and mechanisms. Hum Mol Genet. 1999;8:1947–1954. doi: 10.1093/hmg/8.10.1947. [DOI] [PubMed] [Google Scholar]
  29. Francke U, Hsieh C-L, Foellmer BE, Lomax KJ, Malech HL, Leto TL. Genes for two autosomal recessive forms of chronic granulomatous disease assigned to 1q25 (NCF2) and 7q11.23 (NCF1). Am J Hum Genet. 1990;47:483–492. [PMC free article] [PubMed] [Google Scholar]
  30. Frangiskakis JM, Ewart AK, Morris CA, Mervis CB, Bertrand J, Robinson BF, Klein BP, Ensing GJ, Everett LA, Green ED, et al. LIM-kinase1 hemizygosity implicated in impaired visuospatial constructive cognition. Cell. 1996;86:59–69. doi: 10.1016/s0092-8674(00)80077-x. [DOI] [PubMed] [Google Scholar]
  31. Franke Y, Peoples RJ, Francke U. Identification of GTF2IRD1, a putative transcription factor within the Williams-Beuren syndrome deletion at 7q11.23. Cytogenet Cell Genet. 1999;86:296–304. doi: 10.1159/000015322. [DOI] [PubMed] [Google Scholar]
  32. Gordon D, Abajian C, Green P. Consed: A graphical tool for sequence finishing. Genome Res. 1998;8:195–202. doi: 10.1101/gr.8.3.195. [DOI] [PubMed] [Google Scholar]
  33. Gorlach A, Lee PL, Roesler J, Hopkins PJ, Christensen B, Green ED, Chanock SJ, Curnutte JT. A p47-phox pseudogene carries the most common mutation causing p47-phox-deficient chronic granulomatous disease. J Clin Invest. 1997;100:1907–1918. doi: 10.1172/JCI119721. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Green ED. Strategies for the systematic sequencing of complex genomes. Nat Rev Genet. 2001;2:573–583. doi: 10.1038/35084503. [DOI] [PubMed] [Google Scholar]
  35. Habets GG, van der Kammen RA, Willemsen V, Balemans M, Wiegant J, Collard JG. Sublocalization of an invasion-inducing locus and other genes on human chromosome 7. Cytogenet Cell Genet. 1992;60:200–205. doi: 10.1159/000133336. [DOI] [PubMed] [Google Scholar]
  36. Hallberg E, Wozniak RW, Blobel G. An integral membrane protein of the pore membrane domain of the nuclear envelope contains a nucleoporin-like region. J Cell Biol. 1993;122:513–521. doi: 10.1083/jcb.122.3.513. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Hardison RC. Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet. 2000;16:369–372. doi: 10.1016/s0168-9525(00)02081-3. [DOI] [PubMed] [Google Scholar]
  38. Hardison RC, Oeltjen J, Miller W. Long human–mouse sequence alignments reveal novel regulatory elements: A reason to sequence the mouse genome. Genome Res. 1997;7:959–966. doi: 10.1101/gr.7.10.959. [DOI] [PubMed] [Google Scholar]
  39. Hockenhull EL, Carette MJ, Metcalfe K, Donnai D, Read AP, Tassabehji M. A complete physical contig and partial transcript map of the Williams syndrome critical region. Genomics. 1999;58:138–145. doi: 10.1006/geno.1999.5815. [DOI] [PubMed] [Google Scholar]
  40. Hoogenraad CC, Eussen BHJ, Langeveld A, van Haperen R, Winterberg S, Wouters CH, Grosveld F, De Zeeuw CI, Galjart N. The murine CYLN2 gene: Genomic organization, chromosome localization, and comparison to the human gene that is located within the 7q11.23 Williams syndrome critical region. Genomics. 1998;53:348–358. doi: 10.1006/geno.1998.5529. [DOI] [PubMed] [Google Scholar]
  41. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062. [DOI] [PubMed] [Google Scholar]
  42. Ioannou PA, Amemiya CT, Garnes J, Kroisel PM, Shizuya H, Chen C, Batzer MA, de Jong PJ. A new bacteriophage P1-derived vector for the propagation of large human DNA fragments. Nat Genet. 1994;6:84–89. doi: 10.1038/ng0194-84. [DOI] [PubMed] [Google Scholar]
  43. Jackson SH, Malech HL, Kozak CA, Lomax KJ, Gallin JI, Holland SM. Cloning and functional expression of the mouse homologue of p47phox. Immunogenetics. 1994;39:272–275. doi: 10.1007/BF00188790. [DOI] [PubMed] [Google Scholar]
  44. Jadayel DM, Osborne LR, Coignet LJA, Zani VJ, Tsui L-C, Scherer SW, Dyer MJS. The BCL7 gene family: Deletion of BCL7B in Williams syndrome. Gene. 1998;224:35–44. doi: 10.1016/s0378-1119(98)00514-9. [DOI] [PubMed] [Google Scholar]
  45. Jareborg N, Durbin R. Alfresco—a workbench for comparative genomic sequence analysis. Genome Res. 2000;10:1148–1157. doi: 10.1101/gr.10.8.1148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Jareborg N, Birney E, Durbin R. Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs. Genome Res. 1999;9:815–824. doi: 10.1101/gr.9.9.815. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Kawai J, Shinagawa A, Shibata K, Yoshino M, Itoh M, Ishii Y, Arakawa T, Hara A, Fukunishi Y, Konno H, et al. Functional annotation of a full-length mouse cDNA collection. Nature. 2001;409:685–690. doi: 10.1038/35055500. [DOI] [PubMed] [Google Scholar]
  48. Koop BF, Hood L. Striking sequence similarity over almost 100 kilobases of human and mouse T-cell receptor DNA. Nat Genet. 1994;7:48–53. doi: 10.1038/ng0594-48. [DOI] [PubMed] [Google Scholar]
  49. Korenberg JR, Chen X-N, Hirota H, Lai Z, Bellugi U, Burian D, Roe B, Matsuoka R. VI. Genome structure and cognitive map of Williams syndrome. J Cog Neurosci. 2000;12:89–107. doi: 10.1162/089892900562002. [DOI] [PubMed] [Google Scholar]
  50. Lamerdin JE, Stilwagen SA, Ramirez MH, Stubbs L, Carrano AV. Sequence analysis of the ERCC2 gene regions in human, mouse, and hamster reveals three linked genes. Genomics. 1996;34:399–409. doi: 10.1006/geno.1996.0303. [DOI] [PubMed] [Google Scholar]
  51. Lashkari A, Smith AK, Graham JM., Jr Williams-Beuren syndrome: An update and review for the primary physician. Clin Pediatr. 1999;38:189–208. doi: 10.1177/000992289903800401. [DOI] [PubMed] [Google Scholar]
  52. Liang Y, Wang A, Belyantseva IA, Anderson DW, Probst FJ, Barber TD, Miller W, Touchman JW, Jin L, Sullivan SL, et al. Characterization of the human and mouse unconventional myosin XV genes responsible for hereditary deafness DFNB3 and Shaker 2. Genomics. 1999;61:243–258. doi: 10.1006/geno.1999.5976. [DOI] [PubMed] [Google Scholar]
  53. Loots GG, Locksley RM, Blankespoor CM, Wang ZE, Miller W, Rubin EM, Frazer KA. Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons. Science. 2000;288:136–140. doi: 10.1126/science.288.5463.136. [DOI] [PubMed] [Google Scholar]
  54. Loskutoff DJ, Linders M, Keijer J, Veerman H, van Heerikhuizen H, Pannekoek H. Structure of the human plasminogen activator inhibitor 1 gene: Nonrandom distribution of introns. Biochem. 1987;26:3763–3768. doi: 10.1021/bi00387a004. [DOI] [PubMed] [Google Scholar]
  55. Lu X, Meng X, Morris CA, Keating MT. A novel human gene, WSTF, is deleted in Williams syndrome. Genomics. 1998;54:241–249. doi: 10.1006/geno.1998.5578. [DOI] [PubMed] [Google Scholar]
  56. Lund J, Chen F, Hua A, Roe B, Budarf M, Emanuel BS, Reeves RH. Comparative sequence analysis of 634 kb of the mouse chromosome 16 region of conserved synteny with the human velocardiofacial syndrome region on chromosome 22q11.2. Genomics. 2000;63:374–383. doi: 10.1006/geno.1999.6044. [DOI] [PubMed] [Google Scholar]
  57. Makalowski W, Boguski MS. Evolutionary parameters of the transcribed mammalian genome: An analysis of 2,820 orthologous rodent and human sequences. Proc Natl Acad Sci. 1998;95:9407–9412. doi: 10.1073/pnas.95.16.9407. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Makalowski W, Zhang J, Boguski MS. Comparative analysis of 1196 orthologous mouse and human full-length mRNA and protein sequences. Genome Res. 1996;6:846–857. doi: 10.1101/gr.6.9.846. [DOI] [PubMed] [Google Scholar]
  59. Mallon A-M, Strivens M. DNA sequence analysis and comparative sequencing. Methods. 1998;14:160–178. doi: 10.1006/meth.1997.0575. [DOI] [PubMed] [Google Scholar]
  60. Mallon A-M, Platzer M, Bate R, Gloeckner G, Botcherby MRM, Norksiek G, Strivens MA, Kioschis P, Dangel A, Cunningham D, et al. Comparative genome sequence analysis of the Bpa/Str region in mouse and man. Genome Res. 2000;10:758–775. doi: 10.1101/gr.10.6.758. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Marra MA, Kucaba TA, Dietrich NL, Green ED, Brownstein B, Wilson RK, McDonald KM, Hillier LW, McPherson JD, Waterston RH. High throughput fingerprint analysis of large-insert clones. Genome Res. 1997;7:1072–1084. doi: 10.1101/gr.7.11.1072. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Martindale DW, Wilson MD, Wang D, Burke RD, Chen X, Duronio V, Koop BF. Comparative genomic sequence analysis of the Williams syndrome region (LIMK1-RFC2) of human chromosome 7q11.23. Mamm Genome. 2000;11:890–898. doi: 10.1007/s003350010166. [DOI] [PubMed] [Google Scholar]
  63. Mayor C, Brudno M, Schwartz JR, Poliakov A, Rubin EM, Frazer KA, Pachter LS, Dubchak I. VISTA: Visualizing global DNA sequence alignments of arbitrary length. Bioinformatics. 2000;16:1046–1047. doi: 10.1093/bioinformatics/16.11.1046. [DOI] [PubMed] [Google Scholar]
  64. Meng X, Lu X, Li Z, Green ED, Massa H, Trask BJ, Morris CA, Keating MT. Complete physical map of the common deletion region in Williams syndrome and identification and characterization of three novel genes. Hum Genet. 1998a;103:590–599. doi: 10.1007/s004390050874. [DOI] [PubMed] [Google Scholar]
  65. Meng X, Lu X, Morris CA, Keating MT. A novel human gene FKBP6 is deleted in Williams syndrome. Genomics. 1998b;52:130–137. doi: 10.1006/geno.1998.5412. [DOI] [PubMed] [Google Scholar]
  66. Mervis CB, Klein-Tasman BP. Williams syndrome: Cognition, personality, and adaptive behavior. Ment Retard Dev Disabil Res Rev. 2000;6:148–158. doi: 10.1002/1098-2779(2000)6:2<148::AID-MRDD10>3.0.CO;2-T. [DOI] [PubMed] [Google Scholar]
  67. Mervis CB, Robinson BF, Pani JR. Cognitive and behavioral genetics '99: Visuospatial construction. Am J Hum Genet. 1999;65:1222–1229. doi: 10.1086/302633. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Miller W. So many genomes, so little time. Nat Biotechnol. 2000;18:148–149. doi: 10.1038/72588. [DOI] [PubMed] [Google Scholar]
  69. Morris CA, Mervis CB. Williams syndrome and related disorders. Annu Rev Genomics Hum Genet. 2000;1:461–484. doi: 10.1146/annurev.genom.1.1.461. [DOI] [PubMed] [Google Scholar]
  70. Morris CA, Demsey SA, Leonard CO, Dilts C, Blackburn BL. Natural history of Williams syndrome: Physical characteristics. J Pediatr. 1988;113:318–326. doi: 10.1016/s0022-3476(88)80272-5. [DOI] [PubMed] [Google Scholar]
  71. Nakayama T, Matsuoka R, Kimura M, Hirota H, Mikoshiba K, Shimizu Y, Shimizu N, Akagawa K. Hemizygous deletion of the HPC-1/syntaxin 1A gene (STX1A) in patients with Williams syndrome. Cytogenet Cell Genet. 1998;82:49–51. doi: 10.1159/000015063. [DOI] [PubMed] [Google Scholar]
  72. Oeltjen JC, Malley TM, Muzny DM, Miller W, Gibbs RA, Belmont JW. Large-scale comparative sequence analysis of the human and murine Bruton's tyrosine kinase loci reveals conserved regulatory domains. Genome Res. 1997;7:315–329. doi: 10.1101/gr.7.4.315. [DOI] [PubMed] [Google Scholar]
  73. Onyango P, Miller W, Lehoczky J, Leung CT, Birren B, Wheelan S, Dewar K, Feinberg AP. Sequence and comparative analysis of the mouse 1-megabase region orthologous to the human 11p15 imprinted domain. Genome Res. 2000;10:1697–1710. doi: 10.1101/gr.161800. [DOI] [PubMed] [Google Scholar]
  74. Osborne LR. Williams-Beuren syndrome: Unraveling the mysteries of a microdeletion disorder. Molec Genet Metab. 1999;67:1–10. doi: 10.1006/mgme.1999.2844. [DOI] [PubMed] [Google Scholar]
  75. Osborne L, Pober B. Genetics of childhood disorders: XXVII. Genes and cognition in Williams syndrome. J Am Acad Child Adolesc Psychiatry. 2001;40:732–735. doi: 10.1097/00004583-200106000-00021. [DOI] [PubMed] [Google Scholar]
  76. Osborne LR, Martindale D, Scherer SW, Shi X-M, Huizenga J, Heng HHQ, Costa T, Pober B, Lew L, Brinkman J, et al. Identification of genes from a 500-kb region at 7q11.23 that is commonly deleted in Williams syndrome patients. Genomics. 1996;36:328–336. doi: 10.1006/geno.1996.0469. [DOI] [PubMed] [Google Scholar]
  77. Osborne LR, Herbrick J-A, Greavette T, Heng HHQ, Tsui L-C, Scherer SW. PMS2-related genes flank the rearrangement breakpoints associated with Williams syndrome and other diseases on human chromosome 7. Genomics. 1997a;45:402–406. doi: 10.1006/geno.1997.4923. [DOI] [PubMed] [Google Scholar]
  78. Osborne LR, Soder S, Shi X-M, Pober B, Costa T, Scherer SW, Tsui L-C. Hemizygous deletion of the syntaxin 1A gene in individuals with Williams syndrome. Am J Hum Genet. 1997b;61:449–452. doi: 10.1086/514850. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Osborne LR, Campbell T, Daradich A, Scherer SW, Tsui L-C. Identification of a putative transcription factor gene (WBSCR11) that is commonly deleted in Williams-Beuren syndrome. Genomics. 1999;57:279–284. doi: 10.1006/geno.1999.5784. [DOI] [PubMed] [Google Scholar]
  80. Oshima A, Kyle JW, Miller RD, Hoffmann JW, Powell PP, Grubb JH, Sly WS, Tropak M, Guise KS, Gravel RA. Cloning, sequencing, and expression of cDNA for human beta-glucuronidase. Proc Natl Acad Sci. 1987;84:685–689. doi: 10.1073/pnas.84.3.685. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Osoegawa K, Tateno M, Woon PY, Frengen E, Mammoser AG, Catanese JJ, Hayashizaki Y, de Jong PJ. Bacterial artificial chromosome libraries for mouse sequencing and functional analysis. Genome Res. 2000;10:116–128. [PMC free article] [PubMed] [Google Scholar]
  82. Paperna T, Peoples R, Wang Y-K, Kaplan P, Francke U. Genes for the CPE receptor (CPETR1) and the human homolog of RVP1 (CPETR2) are localized within the Williams-Beuren syndrome deletion. Genomics. 1998;54:453–459. doi: 10.1006/geno.1998.5619. [DOI] [PubMed] [Google Scholar]
  83. Pennacchio LA, Rubin EM. Genomic strategies to identify mammalian regulatory sequences. Nat Rev Genet. 2001;2:100–109. doi: 10.1038/35052548. [DOI] [PubMed] [Google Scholar]
  84. Peoples R, Perez-Jurado L, Wang Y-K, Kaplan P, Francke U. The gene for replication factor C subunit 2 (RFC2) is within the 7q11.23 Williams syndrome deletion. Am J Hum Genet. 1996;58:1370–1373. [PMC free article] [PubMed] [Google Scholar]
  85. Peoples RJ, Cisco MJ, Kaplan P, Francke U. Identification of the WBSCR9 gene, encoding a novel transcriptional regulator, in the Williams-Beuren syndrome deletion at 7q11.23. Cytogenet Cell Genet. 1998;82:238–246. doi: 10.1159/000015110. [DOI] [PubMed] [Google Scholar]
  86. Peoples R, Franke Y, Wang Y-K, Perez-Jurado L, Paperna T, Cisco M, Francke U. A physical map, including a BAC/PAC clone contig, of the Williams-Beuren syndrome-deletion region at 7q11.23. Am J Hum Genet. 2000;66:47–68. doi: 10.1086/302722. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Perez Jurado LA, Peoples R, Kaplan P, Hamel BCJ, Francke U. Molecular definition of the chromosome 7 deletion in Williams syndrome and parent-of-origin effects on growth. Am J Hum Genet. 1996;59:781–792. [PMC free article] [PubMed] [Google Scholar]
  88. Perez Jurado LA, Wang Y-K, Peoples R, Coloma A, Cruces J, Francke U. A duplicated gene in the breakpoint regions of the 7q11.23 Williams-Beuren syndrome deletion encodes the initiator binding protein TFII-I and BAP-135, a phosphorylation target of BTK. Hum Mol Genet. 1998;7:325–334. doi: 10.1093/hmg/7.3.325. [DOI] [PubMed] [Google Scholar]
  89. Perez Jurado LA, Wang Y-K, Francke U, Cruces J. TBL2, a novel transducin family member in the WBS deletion: Characterization of the complete sequence, genomic structure, transcriptional variants and the mouse ortholog. Cytogenet Cell Genet. 1999;86:277–284. doi: 10.1159/000015319. [DOI] [PubMed] [Google Scholar]
  90. Pletcher MT, Wiltshire T, Cabin DE, Villanueva M, Reeves RH. Use of comparative physical and sequence mapping to annotate mouse chromosome 16 and human chromosome 21. Genomics. 2001;74:45–54. doi: 10.1006/geno.2001.6533. [DOI] [PubMed] [Google Scholar]
  91. Reichwald K, Thiesen J, Wiehe T, Weitzel J, Stratling WH, Kioschis P, Poustka A, Rosenthal A, Platzer M. Comparative sequence analysis of the MECP2-locus in human and mouse reveals new transcribed regions. Mamm Genome. 2000;11:182–190. doi: 10.1007/s003350010035. [DOI] [PubMed] [Google Scholar]
  92. Robinson WP, Waslynka J, Bernasconi F, Wang M, Clark D, Kotzot D, Schinzel A. Delineation of 7q11.2 deletions associated with Williams-Beuren syndrome and mapping of a repetitive sequence to within and to either side of the common deletion. Genomics. 1996;34:17–23. doi: 10.1006/geno.1996.0237. [DOI] [PubMed] [Google Scholar]
  93. Scherer SW, Neufeld EJ, Lievens PM, Orkin SH, Kim J, Tsui L-C. Regional localization of the CCAAT displacement protein gene (CUTL1) to 7q22 by analysis of somatic cell hybrids. Genomics. 1993;15:695–696. doi: 10.1006/geno.1993.1130. [DOI] [PubMed] [Google Scholar]
  94. Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, Miller W. PipMaker—A web server for aligning two genomic DNA sequences. Genome Res. 2000;10:577–586. doi: 10.1101/gr.10.4.577. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Shephard EA, Palmer CN, Segall HJ, Phillips IR. Quantification of cytochrome P450 reductase gene expression in human tissues. Arch Biochem Biophys. 1992;294:168–172. doi: 10.1016/0003-9861(92)90152-m. [DOI] [PubMed] [Google Scholar]
  96. Shiraishi T, Druck T, Mimori K, Flomenberg J, Berk L, Alder H, Miller W, Huebner K, Croce CM. Sequence conservation at human and mouse orthologous common fragile regions, FRA3B/FHIT and Fra14A2/Fhit. Proc Natl Acad Sci. 2001;98:5722–5727. doi: 10.1073/pnas.091095898. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Shizuya H, Birren B, Kim U-J, Mancino V, Slepak T, Tachiiri Y, Simon M. Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-factor-based vector. Proc Natl Acad Sci. 1992;89:8794–8797. doi: 10.1073/pnas.89.18.8794. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Soderlund C, Longden I, Mott R. FPC: A system for building contigs from restriction fingerprinted clones. Comput Appl Biosci. 1997;13:523–535. doi: 10.1093/bioinformatics/13.5.523. [DOI] [PubMed] [Google Scholar]
  99. Soderlund C, Humphray S, Dunham A, French L. Contigs built with fingerprints, markers, and FPC V4.7. Genome Res. 2000;10:1772–1787. doi: 10.1101/gr.gr-1375r. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Stojanovic N, Florea L, Riemer C, Gumucio D, Slightom J, Goodman M, Miller W, Hardison R. Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory regions. Nucleic Acids Res. 1999;27:3899–3910. doi: 10.1093/nar/27.19.3899. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Tassabehji M, Metcalfe K, Fergusson WD, Carette MJA, Dore JK, Donnai D, Read AP, Proschel C, Gutowski NJ, Mao X, et al. LIM-kinase deleted in Williams syndrome. Nat Genet. 1996;13:272–273. doi: 10.1038/ng0796-272. [DOI] [PubMed] [Google Scholar]
  102. Tassabehji M, Carette M, Wilmot C, Donnai D, Read AP, Metcalfe K. A transcription factor involved in skeletal muscle gene expression is deleted in patients with Williams syndrome. Eur J Hum Genet. 1999;7:737–747. doi: 10.1038/sj.ejhg.5200396. [DOI] [PubMed] [Google Scholar]
  103. Thomas JW, Summers TJ, Lee-Lin S-Q, Braden Maduro VV, Idol JR, Mastrian SD, Ryan JF, Jamison DC, Green ED. Comparative genome mapping in the sequence-based era: Early experience with human chromosome 7. Genome Res. 2000;10:624–633. doi: 10.1101/gr.10.5.624. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Todd S, McGill JR, McCombs JL, Moore CM, Weider I, Naylor SL. cDNA sequence, interspecies comparison, and gene mapping analysis of argininosuccinate lyase. Genomics. 1989;4:53–59. doi: 10.1016/0888-7543(89)90314-5. [DOI] [PubMed] [Google Scholar]
  105. Touchman JW, Anikster Y, Dietrich NL, Braden Maduro VV, McDowell G, Shotelersuk V, Bouffard GG, Beckstrom-Sternberg SM, Gahl WA, Green ED. The genomic region encompassing the nephropathic cystinosis gene (CTNS): Complete sequencing of a 200-kb segment and discovery of a novel gene within the common cystinosis-causing deletion. Genome Res. 2000;10:165–173. doi: 10.1101/gr.10.2.165. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Touchman JW, Dehejia A, Chiba-Falek O, Cabin DE, Schwartz JR, Orrison BM, Polymeropoulos MH, Nussbaum RL. Human and mouse α-synuclein genes: Comparative genomic sequence analysis and identification of a novel gene regulatory element. Genome Res. 2001;11:78–86. doi: 10.1101/gr.165801. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Valero MC, de Luis O, Cruces J, Perez Jurado LA. Fine-scale comparative mapping of the human 7q11.23 region and the orthologous region on mouse chromosome 5G: The low-copy repeats that flank the Williams-Beuren syndrome deletion arose at breakpoint sites of an evolutionary inversion(s). Genomics. 2000;69:1–13. doi: 10.1006/geno.2000.6312. [DOI] [PubMed] [Google Scholar]
  108. van Duin M, Polman JE, Verkoelen CC, Bunschoten H, Meyerink JH, Olijve W, Aitken RJ. Cloning and characterization of the human sperm receptor ligand ZP3: Evidence for a second polymorphic allele with a different frequency in the Caucasian and Japanese populations. Genomics. 1992;14:1064–1070. doi: 10.1016/s0888-7543(05)80130-2. [DOI] [PubMed] [Google Scholar]
  109. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science. 2001;291:1304–1351. doi: 10.1126/science.1058040. [DOI] [PubMed] [Google Scholar]
  110. Wang Y-K, Harryman Samos C, Peoples R, Perez-Jurado LA, Nusse R, Francke U. A novel human homologue of the Drosophila frizzled wnt receptor gene binds wingless protein and is in the Williams syndrome deletion at 7q11.23. Hum Mol Genet. 1997;6:465–472. doi: 10.1093/hmg/6.3.465. [DOI] [PubMed] [Google Scholar]
  111. Wang Y-K, Perez-Jurado LA, Francke U. A mouse single-copy gene, Gtf2i, the homolog of human GTF2I, that is duplicated in the Williams-Beuren syndrome deletion region. Genomics. 1998;48:163–170. doi: 10.1006/geno.1997.5182. [DOI] [PubMed] [Google Scholar]
  112. Wang Y-K, Sporle R, Paperna T, Schughart K, Francke U. Characterization and expression pattern of the frizzled gene Fzd9, the mouse homolog of FZD9 which is deleted in Williams-Beuren syndrome. Genomics. 1999;57:235–248. doi: 10.1006/geno.1999.5773. [DOI] [PubMed] [Google Scholar]
  113. Wasserman WW, Palumbo M, Thompson W, Fickett JW, Lawrence CE. Human–mouse genome comparisons to locate regulatory sites. Nature Genet. 2000;26:225–228. doi: 10.1038/79965. [DOI] [PubMed] [Google Scholar]
  114. Wedemeyer N, Peoples R, Himmelbauer H, Lehrach H, Francke U, Wanker EE. Localization of the human HIP1 gene close to the elastin (ELN) locus on 7q11.23. Genomics. 1997;46:313–315. doi: 10.1006/geno.1997.5027. [DOI] [PubMed] [Google Scholar]
  115. Wilson MD, Riemer C, Martindale DW, Schnupf P, Boright AP, Cheung TL, Hardy DM, Schwartz S, Scherer SW, Tsui L-C, et al. Comparative analysis of the gene-dense ACHE/TFR2 region on human chromosome 7q22 with the orthologous region on mouse chromosome 5. Nucleic Acids Res. 2001;29:1352–1365. doi: 10.1093/nar/29.6.1352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. Wilson RK, Mardis ER. Shotgun sequencing. In: Birren B, et al., editors. Genome analysis: A laboratory manual. Analyzing DNA. Vol. 1. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press; 1997. pp. 397–454. [Google Scholar]
  117. Wu Q, Zhang T, Cheng J-F, Kim Y, Grimwood J, Schmutz J, Dickson M, Noonan JP, Zhang MQ, Myers RM, et al. Comparative DNA sequence analysis of mouse and human protocadherin gene clusters. Genome Res. 2001;11:389–404. doi: 10.1101/gr.167301. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Wydner KS, Sechler JL, Boyd CD, Passmore HC. Use of an intron length polymorphism to localize the tropoelastin gene to mouse chromosome 5 in a region of linkage conservation with human chromosome 7. Genomics. 1994;23:125–131. doi: 10.1006/geno.1994.1467. [DOI] [PubMed] [Google Scholar]
  119. Yan X, Zhao X, Qian M, Guo N, Gong X, Zhu X. Characterization and gene structure of a novel retinoblastoma-protein-associated protein similar to the transcription regulator TFII-I. Biochem J. 2000;345:749–757. [PMC free article] [PubMed] [Google Scholar]
  120. Younger RM, Amadou C, Bethel G, Ehlers A, Lindahl KF, Forbes S, Horton R, Milne S, Mungall AJ, Trowsdale J, et al. Characterization of clustered MHC-linked olfactory receptor genes in human and mouse. Genome Res. 2001;11:519–530. doi: 10.1101/gr.160301. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press
