Abstract
Circular genomes, which account for the largest proportion of sequenced genomes, play an important role in genome analysis. However, traditional 2D circular maps provide only an overview and annotations of a genome and do not offer feature-based comparison. To remedy these shortcomings, we developed 3D Genome Tuner, a hybrid of circular map and comparative map tools. Its ability to display comparisons between multiple circular maps in a 3D space offers great benefits to the study of comparative genomics. The program is freely available (under an LGPL licence) at http://sourceforge.net/projects/dgenometuner.
Key words: three dimensional, circular map, genome browser, comparative genomics
Introduction
Advances in whole-genome sequencing technology have produced massive amounts of genomic data. One of the best ways to learn the characteristics of a genome and the genetic variation between genomes is to use visualization applications, the so-called genome browsers. Table 1 compares some of these browsers. Modern browsers are generally platform-independent, rich in presentation methods and able to read common sequence formats. It is also noticeable that linear sequence browsers are usually used for comparison while circular sequence browsers are used for presentation, mainly because a circular map lacks the space to accommodate multiple genomes and because links between circular genomes are inconvenient to draw. Although circular genomes can be compared in linear form, the lines linking the head of one genome to the tail of another become remarkably stretched (Figure 1A). The same links are far less affected in a circular comparison map (Figure 1B). We found it more elegant to place the circles in a 3D space: in this mode more information can be accommodated for each circle, and the comparison can be easily viewed by rotating one of the circles (Figure 1C).
Table 1.
Comparison of features of some freely available, stand-alone genome browsers
Program (Reference) | Platform*1 | Input format | Circular view | Linear view | Dot view | Genome comparison*2 | Dimensional |
---|---|---|---|---|---|---|---|
3D Genome Tuner | Java | EMBL, GBK, FASTA, BLAST | √ | √ | √ | 2D/3D | |
ACT (7) | Java | EMBL, GBK, FASTA, GFF | √ | √ | 2D | ||
Bluejay (8) | Java | XML | √ | √ | √ | 2D | |
BugView (9) | Java | GBK | √ | √ (two) | 2D | ||
CGAT (10) | Java (client), Perl (server) | GBK, FASTA | √ | √ | √ (two) | 2D | |
CGView (11) | Java | PTT, XML | √ | 2D | |||
Combo (12) | Java | GBK, FASTA, GFF, GENSCAN | √ | √ | √ (two) | 2D | |
DNAPlotter (13) | Java | EMBL, GBK, FASTA, GFF | √ | √ | 2D | ||
DNAVis (14) | Windows, Linux | GFF, FASTA | √ | √ | √ | 2D/3D | |
GATA (15) | Java | GFF | √ | √ (two) | 2D | ||
GeneViTo (16) | Java | PTT, FNN | √ | √ | 2D | |
GenoMap (17) | Tcl/Tk | GTF | √ | 2D | |||
Genome2D (18) | Windows | tab delimited | √ | √ | 2D | ||
GenomeComp (19) | Perl/Tk | EMBL, GBK, FASTA | √ | √ (two) | 2D | ||
GenomeMatcher (20) | Mac | GBK, FASTA | √ | √ | √ (two) | 2D | |
GenomePlot (21) | Perl/Tk | tab delimited | √ | √ | 2D | ||
GenomeViz (22) | Tcl/Tk/Perl | tab delimited | √ | √ | 2D | ||
Genomorama (23) | Mac, Linux, Windows | EMBL, GBK, FASTA, PTT, ASN.1 | √ | √ | 2D | ||
Mauve (3) | Java | GBK, SEQ, FASTA | √ | √ | 2D | ||
Microbial Genome Viewer (24) | online | EMBL, GBK, tab delimited | √ | √ | 2D | ||
MUMmerplot (25) | Linux | FASTA | √ | √ (two) | 2D | ||
SeqVISTA (26) | Java | GBK, FASTA | √ | √ | 2D | ||
Sockeye (27) | Java | EMBL, GFF | √ | √ | 3D |
*1 Programs developed using Java, Tcl/Tk or Perl are considered to be platform-independent.
*2 “√ (two)” means that the software can compare only two genomes at a time, while “√” indicates no such limit.
Figure 1.
From linear view (A) to 2D circular view (B) then 3D circular view (C). The lines linking homologous regions are condensed to offer a clearer comparison.
Considering that most completed sequences, typically short or prokaryotic ones, are circular DNA (bacterial, plasmid, mitochondrial and chloroplast), we developed a new program, 3D Genome Tuner, which enables the comparison of circular genomes in a 3D space.
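The stretching effect shown in Figure 1 can be illustrated with a small numerical sketch. The Python snippet below is not taken from 3D Genome Tuner; the function names, the unit genome length, the spacing between maps (`gap`) and the circle radius are assumptions chosen for illustration. It compares the length of a link drawn between two stacked linear maps with the length of the same link drawn between two stacked circles in 3D.

```python
import math


def linear_link_length(pos_a, pos_b, gap=1.0):
    """Length of a straight link between two stacked linear maps.

    pos_a and pos_b are fractional positions (0..1) along genomes of unit
    length; gap is the vertical distance between the two maps (assumed).
    """
    return math.hypot(pos_a - pos_b, gap)


def circular_link_length(pos_a, pos_b, gap=1.0, radius=1.0 / (2 * math.pi)):
    """Length of a straight link between two stacked circular maps in 3D.

    Each fractional position is mapped to an angle on a circle; the default
    radius gives a circumference equal to the unit genome length. The two
    circles share an axis and are separated by `gap` along it (assumed).
    """
    angle_a = 2 * math.pi * pos_a
    angle_b = 2 * math.pi * pos_b
    dx = radius * (math.cos(angle_a) - math.cos(angle_b))
    dy = radius * (math.sin(angle_a) - math.sin(angle_b))
    return math.sqrt(dx * dx + dy * dy + gap * gap)


# A link from near the head of one genome to near the tail of another.
head, tail = 0.02, 0.98
print("linear:  ", linear_link_length(head, tail))
print("circular:", circular_link_length(head, tail))
```

Under these assumptions, the head-to-tail link spans almost the full map width in the linear layout, whereas in the stacked circular layout its length stays close to the spacing between the circles, because positions 0.02 and 0.98 are neighbours on a circle.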
Application
Implementation
3D Genome Tuner is a stand-alone application built with Java Swing and the JOGL API (JSR-231 1.1.1). Open source libraries such as JAligner and JFreeChart are incorporated in the package. It adopts the MVC (Model-View-Controller) architecture and has been tested under Microsoft Windows and Fedora Linux. The program is freely available (under an LGPL licence) at http://sourceforge.net/projects/dgenometuner.
User interface
3D Genome Tuner is an interactive application that provides real-time visualization (Figure 2). With the rotate and zoom buttons, users can easily move the camera or scale objects. By selecting one map to rotate while leaving the others fixed in place, users can align maps by their features. 3D Genome Tuner draws on a canvas of fixed size and resolves very closely spaced features by zooming in and out. The number of polygons may affect performance; however, no lag was observed within the range of prokaryotic genome sizes (1). During testing, we found that even the largest bacterial chromosomes, longer than 8,000,000 bp, can be loaded and displayed comfortably on a computer with 512 MB of memory and a moderate 3D graphics card. Users can set the color of any item and the radius of any track on the map. Once changes have been made, the settings are stored in a properties file in the same folder as the program and are loaded at the next startup. The graph can also be exported as an image for publication or demonstration purposes.
Figure 2.
A screenshot showing a comparison between two mitochondrial genomes from Homo sapiens (AC_000021) and Mus musculus domesticus (NC_006914). The tracks from the outside to the inside represent: (1) coordinates; (2) CDSs; (3) homologous regions calculated by NUCmer; (4) GC-content; (5) tRNA and rRNA genes; (6) GC-skew [(G−C)/(G+C)]. Lines linking the upper and lower circular maps are BLAST alignments between the orthologous genes.
Data acquisition
3D Genome Tuner accepts sequences in EMBL, GenBank or FASTA format. Files can be accessed from a local disk or over the network. The program records the URLs of all bacterial and other circular genomes on the NCBI FTP server, so users only need to select a genome from the bookmarks to download it. An unlimited number of sequences can be opened at one time and drawn in a stack. Once a sequence is loaded, three circles are generated by default: GC-content, GC-skew and the coordinates. Genes can be recognized from GenBank or EMBL annotations, or loaded from tab-delimited files containing their locations. Genomic regions specified by the user or imported from MUMmer (2) or Mauve (3) output may also occupy a circle. To show comparisons between genes, a BLAST (4) alignment result is required. 3D Genome Tuner summarizes redundant BLAST matches by choosing the one with the longest overlap among them; duplicated matches with the same score are also retained. Orthologous genes are linked with lines whose colors indicate the matching percentage. MUMmer or Mauve output describing homologous regions between genomes can be parsed and added to the maps in the same manner, except that regions are colored according to their order.
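The two default statistics tracks can be sketched as follows. This Python snippet is an illustration only, not the program's own code: the function name, the window size and the step are assumptions, and the windows here do not wrap around the origin of a circular sequence. GC-skew follows the definition (G−C)/(G+C) given in the legend of Figure 2.

```python
def gc_stats(seq, window=1000, step=1000):
    """Return (start, gc_content, gc_skew) for each window of a sequence.

    window and step are hypothetical defaults. GC-content is (G+C)/length
    of the window; GC-skew is (G-C)/(G+C), set to 0.0 when a window
    contains neither G nor C.
    """
    seq = seq.upper()
    stats = []
    for start in range(0, len(seq), step):
        frame = seq[start:start + window]  # last frame may be shorter
        g = frame.count("G")
        c = frame.count("C")
        gc_content = (g + c) / len(frame)
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        stats.append((start, gc_content, gc_skew))
    return stats


# Toy sequence split into two 4-bp windows.
for start, content, skew in gc_stats("GGGCAATT", window=4, step=4):
    print(start, content, skew)
```

Each tuple could then be drawn as one segment of a circular track, with the start coordinate converted to an angle around the circle.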
3D Genome Tuner allows users to choose among different classification methods for rendering CDS (coding sequence) genes, which helps them better understand genome organization. When codes from Clusters of Orthologous Groups of proteins (COGs) (5) or Gene Ontology (GO) (6) are provided, genes with the same biological function are painted in the same color. Other classification methods are based on characteristics of the genes, such as length or GC-content. By dividing genes into several groups and applying gradient colors to these groups, the distribution of genes can be seen at a glance.
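The grouping-and-gradient idea can be sketched as follows. The binning rule (equal-width groups over the observed range), the number of groups and the two end colors are assumptions made for this illustration; the source does not specify how 3D Genome Tuner forms its groups.

```python
def gradient_colors(values, n_groups=5, low=(0, 0, 255), high=(255, 0, 0)):
    """Assign an RGB color to each value by equal-width grouping.

    Values (for example gene lengths or GC-content) are divided into
    n_groups equal-width bins between their minimum and maximum; each bin
    receives a color interpolated linearly between `low` and `high`.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    colors = []
    for v in values:
        # The maximum value would land in bin n_groups, so clamp it.
        group = min(int((v - lo) / span * n_groups), n_groups - 1)
        t = group / (n_groups - 1) if n_groups > 1 else 0.0
        colors.append(tuple(round(a + (b - a) * t) for a, b in zip(low, high)))
    return colors


# Hypothetical gene lengths in bp.
gene_lengths = [300, 900, 1500, 3000]
print(gradient_colors(gene_lengths, n_groups=4))
```

Genes falling in the same group share one color, so short and long genes (or GC-poor and GC-rich genes) form visibly distinct bands on the map.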
Exploration
3D Genome Tuner provides one navigation window and five sub-windows that allow users to explore genomic features in detail. The navigation window is a tree control that expands to every genome and every feature. Items in the tree control provide tool tips describing the properties of the genome, genes and other features. When an item representing a CDS, RNA or pseudogene is clicked, the locus tag of the gene is labelled beside its location in the graph. If an orthologous gene pair is clicked, a line linking the locations of the two genes is shown. Below the main window, the first sub-window shows a linear image of the currently selected genome, in which CDS and RNA genes are drawn as lines with arrowheads representing their strands. The second sub-window is a plot showing frame-specific statistics, including GC-content, GC-skew and gene coverage. The third sub-window is a sequence browser in which nucleotides are shown when the item for the corresponding gene is clicked in the navigation window. The fourth sub-window shows the amino acid alignment of the orthologous gene pair that the user clicked. The alignment is performed in real time by JAligner (http://jaligner.sourceforge.net), and its parameters can be set in the option panel. The last sub-window shows a zoomed-in picture of the main canvas.
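JAligner implements Smith-Waterman local alignment with affine gap penalties. The simplified sketch below conveys the underlying dynamic programming but is not JAligner's code: it uses a linear gap penalty and a flat match/mismatch score instead of a protein substitution matrix, and the function name and all parameter values are assumptions made for illustration. It returns only the best local alignment score.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap model).

    Each cell holds the best score of a local alignment ending at that
    pair of positions; scores are floored at 0 so that an alignment can
    start anywhere in either sequence.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two toy peptides sharing the local segment "KVLA".
print(smith_waterman_score("MMKVLAGG", "TTKVLAPP"))
```

Keeping only two rows of the score matrix is sufficient when the score alone is needed; recovering the aligned residues, as shown in the fourth sub-window, would require storing the full matrix and tracing back from the best cell.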
3D Genome Tuner carries out a simple analysis by highlighting genomic regions of interest that show high variance in GC-content, gene length, gene count or gene coverage. The analysis assumes that the values of all frames are normally distributed. Values that fall outside the 95% or 99% confidence interval are drawn on the map as pies filled with corresponding colors, aiding researchers in further investigation.
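One way to realise such a test is sketched below. It is a minimal illustration under stated assumptions, not the program's actual procedure: the intervals are taken as mean ± 1.960 and mean ± 2.576 standard deviations of the frame values (the two-sided 95% and 99% ranges of a normal distribution), and the function name and labels are hypothetical.

```python
import statistics

Z_95 = 1.960  # two-sided 95% range of a normal distribution
Z_99 = 2.576  # two-sided 99% range of a normal distribution


def flag_outlier_frames(values):
    """Label each frame value as "99%", "95%" or None.

    A value is labelled "99%" if it lies outside mean +/- 2.576 sd,
    "95%" if it lies outside mean +/- 1.960 sd only, and None otherwise.
    """
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    flags = []
    for v in values:
        z = abs(v - mean) / sd if sd else 0.0
        if z > Z_99:
            flags.append("99%")
        elif z > Z_95:
            flags.append("95%")
        else:
            flags.append(None)
    return flags


# Nineteen frames with typical GC-content and one GC-rich frame.
gc_per_frame = [0.5] * 19 + [0.9]
print(flag_outlier_frames(gc_per_frame))
```

In this toy input only the last frame is flagged, at the 99% level; on the map, such a frame would be the one highlighted for further investigation.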
Conclusion
3D Genome Tuner shows the power of 3D graphics in genome studies. It is a novel tool that enables the viewing of multiple circular genomes in a 3D context by presenting information in distinct dimensions. The improvements are that (1) genomes can be compared in circular form, and (2) alignment links are not stretched excessively. 3D Genome Tuner is a tool for preliminary genome studies and provides a possible way to resolve the difficulties in comparing circular genomes. Its ease of use not only makes research more effective but also gives users a pleasant experience of exploring genomes with animation and vivid colors. Future versions of 3D Genome Tuner will focus on integrating other analysis tools and on providing greater usability. Support for XML annotations and microarray data is planned. The graphics will also be updated to provide more precise features and annotations. We hope that 3D Genome Tuner will open a new perspective in genome visualization research and facilitate studies of comparative genomics.
Authors’ contributions
QW designed and implemented the program, and wrote and revised the manuscript. QL participated in drafting the manuscript and testing the program. XZ supervised the study and revised the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors have declared that no competing interests exist.
Acknowledgements
This project was supported by the National HighTech R&D Program (863 Program) of China (Grant No. 2006AA02A301) awarded to XZ.
References
- 1. Doolittle R.F. Biodiversity: microbial genomes multiply. Nature. 2002;416:697–700. doi: 10.1038/416697a.
- 2. Delcher A.L. Alignment of whole genomes. Nucleic Acids Res. 1999;27:2369–2376. doi: 10.1093/nar/27.11.2369.
- 3. Darling A.C. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14:1394–1403. doi: 10.1101/gr.2289704.
- 4. Altschul S.F. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
- 5. Tatusov R.L. A genomic perspective on protein families. Science. 1997;278:631–637. doi: 10.1126/science.278.5338.631.
- 6. Ashburner M. Gene ontology: tool for the unification of biology. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.
- 7. Carver T.J. ACT: the Artemis Comparison Tool. Bioinformatics. 2005;21:3422–3423. doi: 10.1093/bioinformatics/bti553.
- 8. Soh J. Bluejay 1.0: genome browsing and comparison with rich customization provision and dynamic resource linking. BMC Bioinformatics. 2008;9:450. doi: 10.1186/1471-2105-9-450.
- 9. Leader D.P. BugView: a browser for comparing genomes. Bioinformatics. 2004;20:129–130. doi: 10.1093/bioinformatics/btg383.
- 10. Uchiyama I. CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes. BMC Bioinformatics. 2006;7:472. doi: 10.1186/1471-2105-7-472.
- 11. Stothard P., Wishart D.S. Circular genome visualization and exploration using CGView. Bioinformatics. 2005;21:537–539. doi: 10.1093/bioinformatics/bti054.
- 12. Engels R. Combo: a whole genome comparative browser. Bioinformatics. 2006;22:1782–1783. doi: 10.1093/bioinformatics/btl193.
- 13. Carver T. DNAPlotter: circular and linear interactive genome visualization. Bioinformatics. 2009;25:119–120. doi: 10.1093/bioinformatics/btn578.
- 14. Fiers M.W. DNAVis: interactive visualization of comparative genome annotations. Bioinformatics. 2006;22:354–355. doi: 10.1093/bioinformatics/bti807.
- 15. Nix D.A., Eisen M.E. GATA: a graphic alignment tool for comparative sequence analysis. BMC Bioinformatics. 2005;6:9. doi: 10.1186/1471-2105-6-9.
- 16. Vernikos G.S. GeneViTo: visualizing gene-product functional and structural features in genomic datasets. BMC Bioinformatics. 2003;4:53. doi: 10.1186/1471-2105-4-53.
- 17. Sato N., Ehira S. GenoMap, a circular genome data viewer. Bioinformatics. 2003;19:1583–1584. doi: 10.1093/bioinformatics/btg195.
- 18. Baerends R.J. Genome2D: a visualization tool for the rapid analysis of bacterial transcriptome data. Genome Biol. 2004;5:R37. doi: 10.1186/gb-2004-5-5-r37.
- 19. Yang J. GenomeComp: a visualization tool for microbial genome comparison. J. Microbiol. Methods. 2003;54:423–426. doi: 10.1016/s0167-7012(03)00094-0.
- 20. Ohtsubo Y. GenomeMatcher: a graphical user interface for DNA sequence comparison. BMC Bioinformatics. 2008;9:376. doi: 10.1186/1471-2105-9-376.
- 21. Gibson R., Smith D.R. Genome visualization made fast and simple. Bioinformatics. 2003;19:1449–1450. doi: 10.1093/bioinformatics/btg152.
- 22. Ghai R. GenomeViz: visualizing microbial genomes. BMC Bioinformatics. 2004;5:198. doi: 10.1186/1471-2105-5-198.
- 23. Gans J.D., Wolinsky M. Genomorama: genome visualization and analysis. BMC Bioinformatics. 2007;8:204. doi: 10.1186/1471-2105-8-204.
- 24. Kerkhoven R. Visualization for genomics: the Microbial Genome Viewer. Bioinformatics. 2004;20:1812–1814. doi: 10.1093/bioinformatics/bth159.
- 25. Kurtz S. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12. doi: 10.1186/gb-2004-5-2-r12.
- 26. Hu Z. SeqVISTA: a graphical tool for sequence feature visualization and comparison. BMC Bioinformatics. 2003;4:1. doi: 10.1186/1471-2105-4-1.
- 27. Montgomery S.B. Sockeye: a 3D environment for comparative genomics. Genome Res. 2004;14:956–962. doi: 10.1101/gr.1890304.