Cheng and Nash. 10.1073/pnas.0708368104.

Supporting Information

SI Figure 5

Fig. 5. Properties of INAF-B. (A) Apparent molecular mass. Extracts of heads from Canton-S flies (a) or flies bearing a transgene in which the ORF for INAF-B was modified by addition of an HA tag (b) were electrophoresed and probed with anti-HA, as described in the main text. Note the specificity of the signal and the apparent molecular mass of ~10.5 kDa, judged by comparison with the molecular mass markers (SeeBlue Plus2 prestained standard; Invitrogen, Carlsbad, CA) shown at the left. (B) Membrane association. Heads from flies bearing a transgene with the tagged version of INAF-B were homogenized and centrifuged. The resulting membrane pellet was extracted as described in SI Materials and Methods to distinguish peripheral membrane proteins from integral membrane proteins, and aliquots of the two fractions were electrophoresed, blotted, and probed with anti-HA. The procedure was performed on two independent occasions, with samples a and b from the first run and c-e from the second run. The type and amount of material (in fly head equivalents) loaded on each lane were as follows: (a) peripheral membrane fraction (PMF), 15 heads; (b) integral membrane fraction (IMF), 15 heads; (c) PMF, 15 heads; (d) IMF, 15 heads; (e) IMF, 5 heads. Note the preponderance of signal in the IMF. (C) Dependence on trp function. Strains were constructed to carry one copy of the tagged version of INAF-B and to be homozygous for the wild-type trp gene (a), the loss-of-function allele trpP301 used in Fig. 3 (b), or the loss-of-function allele trpP343 (c). Extracts of heads from these three strains were electrophoresed and blotted as above. Note the virtually complete loss of signal from the tagged INAF-B in the trp mutants.





SI Figure 6

Fig. 6. Sequences of representative proteins that contain the inaF motif. The inaF motif of each representative is highlighted in bold and red. Where a currently active annotation exists, an accession number from the NCBI protein database (www.ncbi.nlm.nih.gov) is given after the species designation and the trivial name assigned to the representative in this work. Where the protein is not actively annotated (second entry for mosquito, mouse, and human), the accession number of the reference genome sequence that contains the ORF is given.





SI Materials and Methods

Construction of Mutated inaF Transgenes

Substitution mutations in inaF-B.

Plasmid pP[inaFG-B_HA] was modified by the recombineering protocol described in the main text to make three sets of nonconservative substitutions in motif residues of INAF-B. In the first case, m1, a double mutation (K35E and R38E) substitutes two acidic residues for two basic residues that are positioned N-terminal to the TM domain. The changes were introduced simultaneously using the following two oligonucleotides: 5'-ggagatcccgatgcccaaatcgaatgacttcttcgagtctgaaaccttcgagctgctc-3' and 5'-tgccgctaacgccgcccatgtagagcaacagggtgagcagctcgaaggtttcag-3'. In the second case, m2, a triple mutation (Y45D, M46D, V49D) substitutes acidic residues for three hydrophobic residues of the TM domain. The changes were introduced simultaneously using the following oligonucleotides: 5'-cttcgagtctaagaccttccgcctgctcaccctgttgctcgatgacggcggcgacagcgg-3' and 5'-tgaacaggtagtagacagccagagtcaagcccatgccgctgtcgccgccgtcatc-3'. In the third case, m3, a triple mutation (I63K, W64K, D65A) substitutes two basic residues for hydrophobic residues and a nonpolar residue for an acidic residue in the region just downstream of the TM domain. The changes were introduced simultaneously using the following oligonucleotides: 5'-ttagcggcatgggcttgactctggctgtctactacctgttcaaaaaggcc-3' and 5'-ggatgcgtgtgcttgaacacggcagcggcggcatgcgtgaggcctttttgaacaggtagt-3'.
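As a sanity check on oligonucleotide pairs like those above, the sketch below looks for the stretch over which the 3' end of the first oligonucleotide is complementary to the 3' end of the second. It assumes the two oligonucleotides of a pair are meant to anneal through their 3' ends; the recombineering protocol itself is described in the main text, so this is an illustration only. The function names `revcomp` and `three_prime_overlap` are hypothetical helpers, not part of any published tool.

```python
def revcomp(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def three_prime_overlap(fwd: str, rev: str) -> str:
    """Longest suffix of fwd that is also a prefix of revcomp(rev).

    This is the region where the 3' ends of the two oligonucleotides
    could base-pair (assumption: the pair is designed to anneal this way).
    """
    rc = revcomp(rev)
    for k in range(min(len(fwd), len(rc)), 0, -1):
        if fwd.endswith(rc[:k]):
            return rc[:k]
    return ""


# m1 oligonucleotides (K35E and R38E), copied from the text.
M1_FWD = "ggagatcccgatgcccaaatcgaatgacttcttcgagtctgaaaccttcgagctgctc"
M1_REV = "tgccgctaacgccgcccatgtagagcaacagggtgagcagctcgaaggtttcag"

overlap = three_prime_overlap(M1_FWD, M1_REV)
print(len(overlap), overlap)
```

The same function can be applied to the m2 and m3 pairs by substituting their sequences.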

Frameshift mutation in inaF-B.

Plasmid pP[inaFG] was digested with RsrII and ligated to a double-stranded adaptor, made by annealing two oligonucleotides (5'-gacgtcccaacatgcc-3' and 5'-gtcggcatgttgggac-3'). This results in a frameshift mutation, m4, at the fourth codon of the inaF-B ORF; the consequent alteration in coding sequence is shown in the main text.
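The arithmetic behind the m4 frameshift can be sketched as follows. The code anneals the two adaptor oligonucleotides in silico, reports the duplex core and the 5' overhangs, and checks that the number of bases added is not a multiple of three. It assumes that a single copy of the adaptor is ligated into the site and that RsrII leaves 3-nt 5' overhangs (GWC); both are assumptions stated here for illustration, and the helper names are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def anneal(top: str, bottom: str):
    """Return (overhang_length, duplex_core) for two equal-length oligos.

    Tries increasing 5' overhang lengths until the rest of the top strand
    matches the start of the reverse complement of the bottom strand.
    """
    rc = revcomp(bottom)
    for n in range(len(top)):
        if top[n:] == rc[: len(top) - n]:
            return n, top[n:]
    raise ValueError("oligonucleotides do not anneal")


# Adaptor oligonucleotides, copied from the text.
TOP = "gacgtcccaacatgcc"
BOTTOM = "gtcggcatgttgggac"

overhang, duplex = anneal(TOP, BOTTOM)
top_overhang = TOP[:overhang]        # 5' single-stranded end of the top strand
bottom_overhang = BOTTOM[:overhang]  # 5' single-stranded end of the bottom strand

# Assumption: one adaptor copy is inserted, adding the duplex core plus
# one overhang length to each strand of the plasmid.
inserted = len(duplex) + overhang
print(top_overhang, duplex, bottom_overhang, inserted, inserted % 3)
```

A remainder other than zero means the downstream reading frame is shifted, which is consistent with the frameshift described for m4.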

Frameshift mutation in 241-aa ORF.

Plasmid pP[inaFG] was modified by recombineering as described above with the following two oligonucleotides: 5'-atcctcgagcagtaccgcatcctcagccacatgcaacagcggccggccagcaactgctgcagcgccaacatctccaactgcagca-3' and 5'-tgctgcagttggagatgttggcgctgcagcagttgctggccggccgctgttgcatgtggctgaggatgcgtactgctcgaggat-3'. This results in a frameshift mutation, m5, at the fourth codon of the 241-aa ORF that has been proposed as the protein (AAF48084) encoded by CG2457. As a consequence, the coding sequence is changed from MQQQRQQLLQ ... 221 aa ... FLTGELIFEK* to MQQRPASNCCSANISNCSSWRQTIASRRSLPRPPSFRHIRIPIHIPGSRPRSRF*.

For each of the constructs described above, DNA sequencing of the final plasmid was used to verify the fidelity of the introduced features.
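The start of the m5 coding sequence can be cross-checked against the first m5 oligonucleotide. The sketch below translates that oligonucleotide from its first ATG using the standard genetic code and compares the result with the beginning of the frameshifted sequence quoted in the text. Treating the first ATG in the oligonucleotide as the start codon of the 241-aa ORF is an assumption made for this illustration, and `translate_from_first_atg` is a hypothetical helper.

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna: str) -> str:
    """Translate complete codons starting at the first 'atg'; stop at a stop codon."""
    start = dna.find("atg")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i : i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First m5 oligonucleotide and the frameshifted protein, copied from the text.
M5_OLIGO = (
    "atcctcgagcagtaccgcatcctcagccacatgcaacagcggccggccagcaactgctgcagc"
    "gccaacatctccaactgcagca"
)
M5_PROTEIN = "MQQRPASNCCSANISNCSSWRQTIASRRSLPRPPSFRHIRIPIHIPGSRPRSRF"

peptide = translate_from_first_atg(M5_OLIGO)
print(peptide, M5_PROTEIN.startswith(peptide))
```

Only the residues covered by complete codons in the oligonucleotide are compared; the remainder of the frameshifted sequence depends on plasmid sequence that is not given here.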

Subcellular Fractionation of INAF.

We slightly modified a procedure designed to distinguish soluble, peripheral membrane, and integral membrane proteins (1). Briefly, heads from ~150 adult flies bearing a P[inaFG-B_HA] transgene were collected on dry ice and homogenized on ice in 400 µl of 20 mM Tris (pH 7.8) containing 0.1 mM PMSF and a mixture of protease inhibitors (Roche, Indianapolis, IN). The homogenate was centrifuged twice at 4°C for 10 min at 1,000 × g. The resulting supernatant was centrifuged at 100,000 × g for 30 min to yield a soluble fraction and a membrane pellet. The pellet (which contained almost all of the HA-tagged material) was vigorously resuspended in 100 mM sodium carbonate (pH 11.5) and, after 30 min on ice, was centrifuged as above to yield a pellet with integral membrane proteins and a supernatant with peripheral membrane proteins. The supernatant was adjusted to 3-4% trichloroacetic acid, and the resulting precipitate was pelleted by brief centrifugation and washed with acetone. Both pellets were resuspended in SDS loading buffer and boiled for 2 min.

In Silico Detection of Proteins with the inaF Motif.

To find close relatives in Drosophila mojavensis, its genome sequence (Agencourt Bioscience Corp., Beverly, MA) was queried with the full-length polypeptide sequence of each of the four D. melanogaster INAF proteins, using BLAT (2) on the UCSC Genome Bioinformatics site (http://genome.ucsc.edu). In each case, examination of the resulting alignment plus the coding potential of the flanking sequence yielded a clear end-to-end ortholog. The deduced amino acid sequences are given in Fig. 6, as are all of the motif-containing proteins discussed below.

To find more distant relatives that might share only the inaF motif, the signature 32-aa stretch of the Drosophila inaF-D polypeptide (Fig. 4B) was used to search the nonredundant protein database at NCBI (www.ncbi.nlm.nih.gov/blast/index.shtml) with BLASTP (3). This revealed highly significant (E-value < 0.001) hits on segments of hypothetical proteins predicted from the genomic sequences of Anopheles gambiae (XP_311045) and Caenorhabditis elegans (NP_505782). Although only a single protein from each organism was identified by this strategy, in the case of the mosquito, scrutiny of the region flanking the hit revealed an unannotated ORF with a recognizable inaF motif. The BLASTP search also turned up, albeit with poorer confidence (E-values < 0.025), hypothetical proteins from zebrafish (AAI22165), chicken (XP_001232207), mouse (XP_993763), and human (EAW57469). When the human protein (which is encoded by an intronless stretch of chromosome 19q32) and the mouse protein were compared (www.ncbi.nlm.nih.gov/blast/bl2seq), they aligned quite well, not just over the inaF motif but over their entire length, except for an N-terminal extension of 53 aa in the predicted human ortholog. Similarly, the chicken and zebrafish proteins aligned well over a stretch of 60 aa. In contrast, only the inaF motif was common to the human/mouse and zebrafish/chicken proteins. However, a TBLASTN search of the human reference genomic sequence with the full-length zebrafish protein turned up another ORF with the inaF motif. This 154-aa protein (which is encoded by an intronless stretch of chromosome 15q15) has a block of 60 aa at its N terminus that almost perfectly matches the sequence found in the inaF motif proteins of the zebrafish and chicken. The mouse genome encodes a 152-aa ORF that is very similar over its entire length to the human chromosome 15 protein. It thus appears that, in contrast to those of non-mammalian vertebrates, mammalian genomes encode two proteins with the inaF motif; the evolutionary origin of this difference is unclear.

The significance of the match of the non-insect ORFs to the inaF motif was confirmed using a hidden Markov model (HMM; www.hmmer.wustl.edu). To build the model, we first aligned the sequences encompassing the 32-aa segment from the isoform D orthologs of selected insects (D. melanogaster, D. mojavensis, D. grimshawi, and A. gambiae). An HMM (4) based on this alignment (constructed with the GCG suite of programs) was then used to search organism-specific segments of the UniProt database (www.pir.uniprot.org) that had been downloaded to the GCG platform. Because several of the motif-containing ORFs were missing from this database, they were added manually. For each organism examined, the inaF-motif proteins described in the above paragraph and shown in Fig. 6 not only gave very significant values (E < 0.000001) but were the only hits with E-values < 1. None of the genes discovered in our search has been functionally annotated, but almost every one of them is well represented in EST databases.

1. Stamnes MA, Shieh BH, Chuman L, Harris GL, Zuker CS (1991) Cell 65:219-227.

2. Kent WJ (2002) Genome Res 12:656-664.

3. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) J Mol Biol 215:403-410.

4. Eddy SR (1998) Bioinformatics 14:755-763.