Supporting Figure 2

Fig. 2.

Frequency distribution of E values in tblastn searches with the Arabidopsis thaliana predicted protein data set. Sequence similarity searches were performed with tblastn, using the A. thaliana predicted proteins extracted from the RefSeq database at the National Center for Biotechnology Information as query sequences against the Physcomitrella patens EST data set. The E value of the best hit from each search was recorded, and the recorded E values are shown as histograms. E values are plotted on a logarithmic scale (horizontal axis) and frequencies on a linear scale (vertical axis). Significant similarities (E ≤ 0.001) are colored gray, whereas best hits with E > 0.001 are shown in white.
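The binning described in this legend can be sketched as follows. This is a minimal illustration and not the authors' actual procedure: the function name `bin_best_hits`, the use of one bin per power of ten, and the `ZERO_FLOOR` stand-in for E values reported as 0.0 are all assumptions; only the 0.001 cutoff comes from the legend.

```python
import math
from collections import Counter

SIGNIFICANCE_CUTOFF = 1e-3  # threshold stated in the legend (E <= 0.001)
ZERO_FLOOR = 1e-180         # assumption: placeholder for E values reported as 0.0


def bin_best_hits(e_values):
    """Count best-hit E values per power-of-ten bin, split by significance.

    Returns two Counters keyed by floor(log10(E)): one for significant
    hits (gray bars in the figure) and one for the rest (white bars).
    """
    significant = Counter()
    not_significant = Counter()
    for e in e_values:
        # log10 is undefined at 0, so clamp to the hypothetical floor first.
        exponent = math.floor(math.log10(max(e, ZERO_FLOOR)))
        if e <= SIGNIFICANCE_CUTOFF:
            significant[exponent] += 1
        else:
            not_significant[exponent] += 1
    return significant, not_significant


# Hypothetical best-hit E values, one per query protein.
sig, non = bin_best_hits([2e-50, 5e-4, 0.5, 3.0, 0.0])
print(sorted(sig.items()), sorted(non.items()))
```

Keeping the two classes in separate counters makes it straightforward to draw them as differently shaded bars on a shared logarithmic axis.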