Supporting information for Berman et al. (2002) Proc. Natl. Acad. Sci. USA 99 (2), 757–762. (10.1073/pnas.231608898)
Fig. 5.
Binding site sequences and PWMs. Binding sites for Bcd, Cad, Hb, Kr, and Kni, identified by DNase protection, were collected from the literature (aided greatly by a previous collection from ref. 1). Sequences for Bcd, Hb, and Kr were aligned using the pattern discovery tool MEME (ver. 3.0; ref. 21) with the following command line settings: "-mod zoops -revcomp -dna". The "-minsites" parameter was set to 80% of the total number of sites collected for each transcription factor. This setting allowed up to 20% of the binding site sequences, those that aligned poorly, to be omitted as potential sources of experimental error. For Bcd, 51/51 sites were aligned; for Hb, 93/93 sites were aligned; for Kr, 29/37 sites were aligned. A background model file (the "-bfile" option) was used that included mono-nucleotide, di-nucleotide, and tri-nucleotide frequencies determined from the intergenic D. melanogaster genomic sequence as annotated in the Berkeley Drosophila Genome Project (BDGP)/Celera Release 1. Individual binding site sequences for Cad and Kni were aligned manually as described in the source publications. Sequence Logos (ref. 2) were constructed using the tool available at http://www.bio.cam.ac.uk/seqlogo/. Additional information on sequences is available at http://www.fruitfly.org/cis-analyst.
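
The MEME settings described above can be sketched as follows. This is an illustration only, not the authors' actual invocation: the input and background file names are hypothetical, and rounding the 80% threshold down to a whole number of sites is an assumption, since the caption does not state how fractional values were handled. Only the options quoted in the caption ("-mod zoops", "-revcomp", "-dna", "-minsites", "-bfile") are used.

```python
# Sketch only: file names and the rounding rule are assumptions,
# not details taken from the caption.

# Total number of collected sites per factor, as reported in the caption.
SITE_COUNTS = {"Bcd": 51, "Hb": 93, "Kr": 37}


def min_sites(total, percent=80):
    """Return percent% of total, rounded down (rounding is an assumption)."""
    return (total * percent) // 100


def meme_args(factor, total, background="intergenic.bg"):
    """Assemble a MEME argument list from the options quoted in the caption.

    The FASTA and background file names are hypothetical placeholders.
    """
    return [
        "meme", f"{factor}_sites.fasta",
        "-mod", "zoops", "-revcomp", "-dna",
        "-minsites", str(min_sites(total)),
        "-bfile", background,
    ]


if __name__ == "__main__":
    # Print one command line per factor; nothing is executed.
    for factor, total in SITE_COUNTS.items():
        print(" ".join(meme_args(factor, total)))
```

The sketch only prints the command lines, so it can be run without MEME installed. A different rounding rule (for example, rounding up) would change the "-minsites" value for each factor.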