2003 Winter;2:233–247. doi: 10.1187/cbe.03-06-0026

Table 5.

Computational biology exercises and projects for developing bioinformatics skills

Training exercises
  1. Annotation of a genomic DNA sequence. Students were given the human H-Ras genomic sequence and asked to use on-line tools to identify as many of the following features as possible: intron–exon boundaries, promoter or transcription factor binding sites, transcription initiation and termination sites, polyadenylation sites, polymorphisms found in human populations, sites of oncogenic mutations, splicing acceptor and donor sites, repetitive DNA elements within this segment of DNA, and low-complexity segments. The students were asked to use a search engine such as Google to find the appropriate tool on the Internet. One goal of this assignment was to show them that a large number of specialized bioinformatics tools could be found on the Web. The other goal was to give them practice with the kind of analyses they would carry out in their group research project.

  2. Compare prokaryotic and eukaryotic gene organization and gene identification. See the text for a description of this exercise.

  3. Compare the success of various programs for prediction of transmembrane segments in protein sequences (using both “knowns” and “unknowns”). See the text for a description of this exercise.

  4. Comparative analysis of the domain structure of actin cross-linking proteins. Using tools such as the Conserved Domain Database at NCBI, students were asked to compare and contrast the modular organization of actin cross-linking proteins such as fimbrin/plastin, α-actinin, β-spectrin, and dystrophin, with a view to understanding differences in their structure and function within the actin cytoskeleton.

Group research project
  Sequence and structural analysis of a large, multidomain protein (human ryanodine receptor genes/proteins). See the text for a description of this exercise.

Group programming project
  Development of a Web-based Perl program that determines the amino acid composition of any protein sequence in FASTA format. See the text for a description of this exercise.

Software required: Web Browser, RasMol, TreeView

Useful Web sites: Most of the sites listed in Table 2
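
The core computation in the group programming project (amino acid composition of a FASTA-formatted protein sequence) can be sketched as follows. This is a minimal illustration only, not the students' program: it is written in Python rather than the Perl used in the course, it omits the Web front end, and the function names `read_fasta` and `amino_acid_composition` as well as the example peptide are hypothetical.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def amino_acid_composition(sequence):
    """Return {residue: (count, percent)} for the letters in a sequence.

    Non-letter characters (for example a trailing '*') are ignored.
    """
    counts = Counter(ch for ch in sequence if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: (n, 100.0 * n / total) for aa, n in sorted(counts.items())}


if __name__ == "__main__":
    # Hypothetical input, invented for illustration only.
    example = ">example_peptide (made up)\nMKKLLA\nGGA\n"
    for name, seq in read_fasta(example):
        print(name, "length", len(seq))
        for aa, (n, pct) in amino_acid_composition(seq).items():
            print(f"  {aa}\t{n}\t{pct:.1f}%")
```

Separating the FASTA parsing from the counting keeps each step easy to check on its own; a Web form would only need to pass the submitted text to `read_fasta` and render the returned table.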