GEDAS ‐ Gene Expression Data Analysis Suite

Tangirala Venkateswara Prasad; Ravindra Pentela Babu; Syed Ismail Ahson

Bioinformation. 2006 Jan 26;1(3):83–85. doi: 10.6026/97320630001083

PMCID: PMC1891661 PMID: 17597861

Abstract

Currently available microarray gene expression data analysis tools lack standardization at various levels. We developed GEDAS (gene expression data analysis suite) to bring various tools and techniques together in one system. It also provides a number of other features, such as a large collection of distance measures and pre-processing techniques. The software is an extension of Cluster 3.0 (itself based on the Eisen Lab's Cluster and TreeView software). GEDAS allows different datasets to be analyzed with algorithms such as k-means, hierarchical clustering (HC), SVD/PCA and SVM, in addition to Kohonen's SOM and LVQ.

Availability

http://gedas.bizhat.com/gedas.htm

Keywords: gene expression, standardization, GEDAS, cluster, software

Background

This work attempts to integrate different tools and techniques for gene expression analysis, with the aim of standardizing them for efficient use. In this context, a number of tools such as Cluster/TreeView [1], SNOMAD [2], Cluster 3.0 [3], GEDA suite [4], GEPAS [5], J-Express [6], Cleaver 1.0 [7] and Expression Profiler [8] have been extensively studied and significantly improved in recent years. Here, we describe a software package called GEDAS (gene expression data analysis suite), developed by integrating techniques such as SOM, LVQ, k-means, hierarchical clustering, SVM [9] and PCA. The software supports a number of visualization techniques and gene expression data pre-processing algorithms [1–4], and it contains over 10 visualizations and 19 distance measures.
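The text does not list the 19 distance measures, so as an illustration only, the following sketch shows three measures commonly used to compare expression profiles: Euclidean distance, Manhattan distance and a Pearson correlation distance. The function names and the sample profiles are hypothetical and are not taken from GEDAS.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of absolute differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def pearson_distance(x, y):
    """1 - Pearson correlation: near 0 for the same shape, near 2 for opposite shape.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

# Hypothetical profiles for three genes over four conditions.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # same shape as gene_a, larger amplitude
gene_c = [4.0, 3.0, 2.0, 1.0]   # mirror image of gene_a
```

The choice of measure changes which genes count as similar: `gene_a` and `gene_b` are far apart under Euclidean distance but essentially identical under the correlation-based distance, which is one reason an analysis suite may offer many measures.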

Methodology

GEDAS was developed in Visual Basic and Visual C++ as stand-alone software for the analysis of microarray gene expression data. Microarray datasets can be loaded from plain text files or from MS Excel or MS Access formats. The software uses Crystal Reports to generate outputs. A snapshot of GEDAS is shown in Figure 1.
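The layout of the plain text input is not specified here, so the following is only a sketch of how such a file might be read. It assumes a tab-delimited table with a header row, gene identifiers in the first column and empty cells for missing values. The function name `load_expression_table` and the sample data are hypothetical.

```python
import csv
import io

def load_expression_table(handle):
    """Read a tab-delimited table: header row, then one gene per row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]               # first column holds the gene identifier
    genes, matrix = [], []
    for row in reader:
        if not row:                    # skip blank lines
            continue
        genes.append(row[0])
        # Empty cells are kept as None so missing values stay visible.
        matrix.append([float(v) if v.strip() else None for v in row[1:]])
    return genes, samples, matrix

# Hypothetical two-gene, three-sample file held in memory for the example.
text = "GENE\tS1\tS2\tS3\ngene1\t0.5\t-1.2\t0.0\ngene2\t1.1\t\t0.3\n"
genes, samples, matrix = load_expression_table(io.StringIO(text))
```

Keeping missing values as `None` rather than zero leaves the decision on how to handle them (filtering or imputation) to a later pre-processing step.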

Utility

The software facilitates various levels of data manipulation during pre-processing. GEDAS generates at least 6 different outputs for any analysis, unlike many other tools, which produce just one. A whole genome visualization tool is introduced in this development [10]. In addition to traditional plots and graphs such as scatter plots and histograms, the temporal (or wave) graph, tree view, tree map and whole genome view were standardized, developed and integrated into the software. We evaluated the tools using breast cancer, mouse (Mus musculus), Arabidopsis thaliana, Homo sapiens and sugarcane datasets. Another important inclusion is the representation of hierarchical clustering output in the form of a temporal (or wave) graph. In GEDAS, results are presented in a number of ways described elsewhere [4, 11–16]. The techniques implemented in GEDAS are given in Table 1. The software facilitates sorting of data by rows, columns or both. The output can be exported in PDF, BMP, GIF and JIF formats.
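To illustrate the kind of result from which a tree view or a cluster-ordered graph is drawn, here is a minimal sketch of agglomerative hierarchical clustering. It assumes average linkage and Euclidean distance, neither of which is specified in the text, and the function names and data are hypothetical rather than taken from GEDAS.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def average_linkage(profiles):
    """Agglomerative clustering; returns the merges in the order they happen.

    Each merge is (left_members, right_members, distance), where members
    are row indices into `profiles`.
    """
    clusters = [[i] for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters' members.
                d = sum(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Remove j first (j > i) so that index i stays valid.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return merges

# Hypothetical profiles: rows 0-1 are close together, as are rows 2-3.
profiles = [
    [0.0, 0.0],
    [0.0, 1.0],
    [5.0, 5.0],
    [5.0, 6.0],
]
merges = average_linkage(profiles)
```

The merge list is what a dendrogram encodes: early merges at small distances become the lowest branches, and the member order of the final cluster gives a row ordering in which similar profiles sit next to each other.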

Table 1. Visualization techniques included in GEDAS and the data or algorithm outputs to which each applies.

Visualization/Algorithm	Raw data	Pre-processed data	SOM	K-Means	LVQ	HC	PCA (gene)	SVM
Histogram	✓	✓					✓
Checks view	✓	✓	✓	✓	✓	✓	✓	✓
Microarray	✓	✓	✓	✓	✓	✓	✓	✓
Whole sample	✓	✓	✓	✓	✓	✓	✓	✓
Proximity map	✓	✓	✓	✓	✓	✓	✓	✓
Temporal (incl. zoomed cluster view)			✓	✓	✓	✓	✓	✓
Textual			✓	✓	✓	✓	✓	✓
PC view							✓
Eigen graph							✓
Tree view						✓
Scatter plot & M vs. A plot	✓	✓					✓
Box-Whisker plot	✓	✓
Gene Ontology			✓	✓	✓	✓	✓	✓
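
The M vs. A plot listed in Table 1 is conventionally built from two-channel intensities: M is the log ratio of the two channels and A is their mean log intensity. The sketch below computes both values per spot under that standard definition. The function name and the red and green intensities are hypothetical, and the text does not describe how GEDAS itself computes the plot.

```python
import math

def ma_values(red, green):
    """Return (M, A) for one spot from its two channel intensities."""
    m = math.log2(red) - math.log2(green)          # log ratio
    a = 0.5 * (math.log2(red) + math.log2(green))  # mean log intensity
    return m, a

# Hypothetical (red, green) intensities for three spots.
spots = [(800.0, 200.0), (256.0, 256.0), (100.0, 400.0)]
points = [ma_values(r, g) for r, g in spots]
```

Plotting M against A makes intensity-dependent bias visible: without such bias, spots for unchanged genes scatter around M = 0 across the whole range of A.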

Future work

In future development, we plan to incorporate other visualization tools [4, 17], including 2D and 3D score plots, profile plots, scatter plots (3D scatter plots, PCA visualization, ISOMAP visualization and multi-dimensional scaling), Venn diagrams for visualizing similar elements in microarrays, and SOM visualization of clustering results. We also plan to provide the software through a web interface. Our other plans include the addition of robust distance measures and data mining tools (fuzzy c-means and agglomerative clustering).
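
As background on the planned fuzzy c-means tool, the sketch below shows the standard membership update, in which a profile belongs to every cluster to a degree that depends on its relative distance to each cluster centre. The function name, the fuzzifier default `m=2.0` and the data are assumptions for illustration and do not describe any GEDAS implementation.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Membership of one profile in each cluster centre (values sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A profile sitting exactly on a centre belongs entirely to that centre.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]

# Hypothetical profile lying twice as far from the second centre as the first.
u = fuzzy_memberships([1.0, 0.0], [[0.0, 0.0], [3.0, 0.0]])
```

Unlike k-means, which assigns each gene to exactly one cluster, these graded memberships let a gene contribute to several clusters at once.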

Acknowledgments

The software products mentioned are either trademarks or registered trademarks of their respective individuals or corporations and are acknowledged as such.

Footnotes

Citation: Prasad et al., Bioinformation 1(3): 83–85 (2006)

References

1. http://rana.lbl.gov/EisenSoftware.htm
2. http://pevsnerlab.kennedykrieger.org/snomad.htm
3. http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster
4. http://bioinformatics.upmc.edu/GE2/GEDA.html
5. http://gepas.bioinfo.cnio.es
6. http://www.molmine.com
7. http://classify.stanford.edu/
8. http://www.ebi.ac.uk/expressionprofiler
9. http://www.csie.ntu.edu.tw/~cjlin/libsvm
10. Caron H, et al. Science. 2001;291:1289. doi: 10.1126/science.1056794.
11. Chen CH. Statistica Sinica. 2002;12:7.
12. Luo F, et al. BIBE. 2003;328.
13. Tavazoie S, et al. Nature Genetics. 1999;22:281. doi: 10.1038/10343.
14. Toronen P, et al. FEBS Letters. 1999;451:142. doi: 10.1016/s0014-5793(99)00524-4.
15. http://bioinfo.cnio.es/docus/SOTA/#Software
16. http://cs.hut.fi
17. http://www.silicocyte.com
