Supporting Data
Fine Mapping of the t(11;12)(q23;q24). To fine map the t(11;12)(q23;q24) breakpoint, we used the sequence obtained from BAC44d21 to generate primers designed to amplify PCR products at 10-kb intervals across this genomic region (Fig. 6). The somatic cell hybrid KCI 221 CL1, a clone found cytogenetically to contain only the human der(12) chromosome and to lack both the der(11) and the normal human chromosomes 11 and 12, was used in the analysis. Positive and negative PCR reactions reflected chromosome 11q23-q24 material fused to the der(12) and retained on the der(11) chromosome, respectively. Amplicons spaced every 0.5 kb within the 5'-BRKPT.10 to 5'-BRKPT.11 interval were designed to narrow the breakpoint further, to within 0.8 kb centromeric of the 5'-BRKPT.11 PCR product (Fig. 6).
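The mapping logic described above can be illustrated with a short sketch: in a hybrid that carries only the der(12), the breakpoint lies between the two adjacent markers whose PCR results differ. The marker names, positions, and results below are hypothetical placeholders chosen for illustration and are not data from this study; the function name is likewise an assumption.

```python
from typing import List, Tuple

# (marker name, position in kb along the region, PCR positive in the der(12)-only hybrid)
Marker = Tuple[str, float, bool]


def breakpoint_interval(markers: List[Marker]) -> Tuple[str, str, float]:
    """Return the two adjacent markers that flank the breakpoint and their spacing in kb.

    A positive marker is present on the der(12); a negative marker is absent from it
    (retained on the der(11)). Exactly one positive/negative transition is expected.
    """
    ordered = sorted(markers, key=lambda m: m[1])
    flips = [(a, b) for a, b in zip(ordered, ordered[1:]) if a[2] != b[2]]
    if len(flips) != 1:
        raise ValueError(f"expected exactly one transition, found {len(flips)}")
    left, right = flips[0]
    return left[0], right[0], right[1] - left[1]


# Hypothetical markers at 10-kb spacing (illustrative values only).
example = [
    ("m0", 0.0, False),
    ("m10", 10.0, False),
    ("m20", 20.0, True),
    ("m30", 30.0, True),
    ("m40", 40.0, True),
]
flanking = breakpoint_interval(example)
```

Repeating the same call with markers placed at finer spacing inside the flanking interval mirrors the second round of mapping described in the text.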
BCSC-1 Splice Variants and Protein Isoforms. The major isoform, BCSC-1(f), is a 3.4-kb variant that includes all identified exons, with an open reading frame of 2,361 bp encoding a protein of 786 aa with a predicted molecular mass of 86 kDa. The isoform BCSC-1(c) differs from BCSC-1(f) only in that it lacks the noncoding exon 1b. BCSC-1(d) is a 1.9-kb splice variant that contains exons 1 through 10, with an open reading frame of 1,248 bp encoding a predicted protein of 415 aa and a molecular mass of 46 kDa; BCSC-1(a) differs from BCSC-1(d) by lacking the noncoding exon 1b. BCSC-1(e) is a 2.0-kb transcript that lacks exons 8 through 16, contains an 801-bp open reading frame, and encodes a 266-aa protein with a predicted molecular mass of 29 kDa. The isoform BCSC-1(b) differs from BCSC-1(e) by lacking the noncoding exon 1b. An additional variant, BCSC-1(g), was discovered by Gentile and colleagues (12). Based on the original nucleotide sequence AF002672, they identified a deletion in the BCSC-1 gene in 7 of 34 (20%) early-onset breast cancer cases. Relative to the genomic and transcription maps of BCSC-1 presented here, these deletions would splice exons 10-14, resulting in a 1.65-kb mRNA with a 1,281-bp open reading frame encoding a 426-aa protein with a predicted molecular mass of 47 kDa (Fig. 8). It remains to be seen whether this is a tumor-specific variant of the BCSC-1 gene.
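The open reading frame lengths, protein lengths, and predicted masses quoted above can be cross-checked with simple arithmetic. The sketch below assumes that each reported open reading frame length includes the stop codon and uses an average residue mass of about 110 Da, a common rule of thumb that is not taken from this study; actual predicted masses depend on amino acid composition.

```python
# Rule-of-thumb average mass per amino acid residue (assumption, not from the source).
AVG_RESIDUE_MASS_DA = 110.0


def orf_to_protein_length(orf_bp: int) -> int:
    """Number of residues encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


def approx_mass_kda(n_residues: int) -> float:
    """Approximate molecular mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# (ORF length in bp, reported protein length in aa, reported mass in kDa), from the text.
ISOFORMS = {
    "BCSC-1(f)": (2361, 786, 86),
    "BCSC-1(d)": (1248, 415, 46),
    "BCSC-1(e)": (801, 266, 29),
}

for name, (orf_bp, reported_aa, reported_kda) in ISOFORMS.items():
    n_aa = orf_to_protein_length(orf_bp)
    print(name, n_aa, reported_aa, round(approx_mass_kda(n_aa), 1), reported_kda)
```

Under these assumptions the computed residue counts match the reported protein lengths, and the approximate masses fall within 1 kDa of the reported values.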
Cloning of the Mouse bcsc-1 Gene. We screened the mouse dbEST database with the human full-length BCSC-1 cDNA sequence by blastn and blastx (www.ncbi.nlm.nih.gov) to identify a murine homologue of BCSC-1. The blast searches retrieved three partially sequenced EST clones, AA242092, AI121130, and AI414583, with identities to the BCSC-1 nucleotide sequence of 83%, 90%, and 90%, respectively. Sequencing of IMAGE clones corresponding to each of the ESTs, followed by 5' and 3' RACE using primers derived from internal EST sequences, generated 1,022 bp of overlapping cDNA sequence. This partial cDNA sequence contained 138 bp of the 5' UTR and a continuous, nonterminating 884-bp open reading frame with 80% nucleotide identity and 76%/86% protein identity/similarity to the human BCSC-1 gene (Fig. 9). The consensus sequence of this partial cDNA was used for blast comparisons to identify a mouse UNIGENE cluster, Mm.34060, which consisted of 144 ESTs from multiple tissue libraries. Of those partial EST sequences, 131 enabled us to extend our cDNA contig in both the 5' and 3' directions to a 3.2-kb contiguous transcript. The full-length homologue was isolated by 5' and 3' RACE experiments using mouse spleen, kidney, and testis cDNA, extending the cDNA sequence to 4.121 kb. All sequences were verified by RT-PCR and direct sequencing of various normal mouse cDNAs (data not shown). Northern blot analysis of a panel of mouse tissues was performed to confirm the existence of the predicted full-length cDNA and to exclude the possibility of any splicing variants (Fig. 9). The murine bcsc-1 gene appears to be expressed as a single transcript that is abundant in heart, lung, liver, and kidney, present at intermediate levels in brain and spleen, and present at low levels in both muscle and testis. As expected from the lack of heterogeneity in the assembled EST and RACE contigs, the bcsc-1 gene does not generate multiple mRNA isoforms as the human gene does.
Sequence analysis of the full-length cDNA revealed an open reading frame of 2.4 kb encoding a protein of 793 aa with a predicted molecular mass of 87 kDa (Fig. 9). A pairwise blastp comparison demonstrated that the murine protein shares 67% identity and 78% similarity with the human BCSC-1 protein. Like its human counterpart, the predicted murine protein lacks significant homology to proteins of known function and possesses two conserved domains identified by the Conserved Domain Database (www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml). These domains, the Vault protein Inter-alpha-Trypsin domain and the von Willebrand Type A domain, align to amino acids at positions 1-130 and 282-443, respectively.