Nucleic Acids Research. 1998 Oct 15;26(20):4748–4757. doi: 10.1093/nar/26.20.4748

Prediction of locally optimal splice sites in plant pre-mRNA with applications to gene identification in Arabidopsis thaliana genomic DNA.

V Brendel, J Kleffe
PMCID: PMC147908  PMID: 9753745

Abstract

Prediction of splice site selection and efficiency from sequence inspection is of fundamental interest (testing the current knowledge of requisite sequence features) and practical importance (genome annotation, design of mutant or transgenic organisms). In plants, the dominant variables affecting splice site selection and efficiency include the degree of matching to the extended splice site consensus and the local gradient of U- and G+C-composition (introns being U-rich and exons G+C-rich). We present a novel method for splice site prediction, which was particularly trained for maize and Arabidopsis thaliana. The method extends our previous algorithm based on logitlinear models by considering three variables simultaneously: intrinsic splice site strength, local optimality and fit with respect to the overall splice pattern prediction. We show that the method considerably improves prediction specificity without compromising the high degree of sensitivity required in gene prediction algorithms. Applications to gene identification are illustrated for Arabidopsis and suggest that successful methods must combine scoring for splice sites, coding potential and similarity with potential homologs in non-trivial ways. A WWW version of the SplicePredictor program is available at http://gnomic.stanford.edu/volker/SplicePredictor.html/
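
The compositional gradient mentioned in the abstract (introns U-rich, exons G+C-rich) can be illustrated with a small sketch. This is not the logitlinear model or the SplicePredictor program; it only computes, for each candidate GT donor dinucleotide in genomic DNA, how much the U (T in DNA) fraction rises and the G+C fraction falls across the site. The function names `composition` and `donor_site_contrasts` and the default window of 50 bases are assumptions chosen for illustration.

```python
def composition(seq: str, letters: str) -> float:
    """Fraction of positions in seq that are one of the given letters."""
    if not seq:
        return 0.0
    return sum(base in letters for base in seq) / len(seq)


def donor_site_contrasts(dna: str, window: int = 50):
    """Yield (position, u_gain, gc_drop) for each candidate GT donor site.

    position: 0-based index of the G in the GT dinucleotide.
    u_gain:   T fraction downstream (putative intron) minus upstream (putative exon).
    gc_drop:  G+C fraction upstream minus downstream.
    The window size is an illustrative assumption, not a value from the paper.
    """
    dna = dna.upper()
    # Only consider sites with a full window on both sides.
    for i in range(window, len(dna) - window - 1):
        if dna[i:i + 2] != "GT":
            continue
        upstream = dna[i - window:i]
        downstream = dna[i + 2:i + 2 + window]
        u_gain = composition(downstream, "T") - composition(upstream, "T")
        gc_drop = composition(upstream, "GC") - composition(downstream, "GC")
        yield i, u_gain, gc_drop


if __name__ == "__main__":
    # Toy sequence: a G+C-rich "exon" followed by GT and a T-rich "intron".
    toy = "GC" * 5 + "GT" + "T" * 10
    for pos, u_gain, gc_drop in donor_site_contrasts(toy, window=10):
        print(pos, u_gain, gc_drop)
```

Large positive values of both contrasts are consistent with an exon-to-intron transition; a real predictor would combine such signals with the match to the extended splice site consensus rather than use them alone.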



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
