Table 1.
Library Name | Release Date | Programming Language | License | Website | Features |
---|---|---|---|---|---|
EMBOSS [43] | 2000 | C, C++, BTL, others | GNU GPL | http://emboss.sourceforge.net/ | Sequence alignment; rapid database search; protein motif identification; nucleotide sequence pattern analysis; codon usage analysis for small genomes; rapid identification of sequence patterns in large scale sequence sets; presentation tools for publication. |
BTL [41] | 2001 | C++ | GNU GPL | http://www.cryst.bbk.ac.uk/~classlib/ | Data structures (e.g. graphs); nucleotide string methods (e.g. Fourier transform, Needleman-Wunsch alignment). |
Bioperl [47] | 2002 | Perl | Artistic License, GNU GPL | http://bioperl.org/ | Access sequence data from local/remote databases; manage database formats; database search; manipulating sequences/sequence alignments; gene annotations. |
Bioconductor [50] | 2003 | R (C/C++) | Artistic, BSD, GNU GPL | https://www.bioconductor.org/ | Repository of multiple libraries for analysis and comprehension of genomic and -omics data, including NGS. |
BioPHP | 2003 | PHP | GNU GPL | http://biophp.org/ | DNA and protein sequence analysis, sequence alignment. |
GenomeTools [58] | 2003 | C | Open BSD | http://genometools.org/ | Parsing, compression, k-mer, suffix trees, annotation, error correction and other sequence analytics (FASTA, FASTQ) |
Pizza&Chili [94] | 2005 | C/C++ | GNU Lesser GPL | http://pizzachili.di.unipi.it/ | Compressed indices, text collections |
Bio++ [42] | 2006 | C++ | CeCILL GPL | http://kimura.univ-montp2.fr/BioPP | Sequence analysis, phylogenetics, molecular evolution; population genetics. |
Biojava [46] | 2008 | Java | GNU Lesser GPL | www.biojava.org/ | Manipulate biological sequences; file parsing; DAS client/server support; access to BioSQL/Ensembl databases; tools for making sequence analysis GUIs; statistical routines; dynamic programming toolkit. |
SeqAn [52] | 2008 | C++ | BSD 3-clause | http://www.seqan.de/ | Extensive set of algorithms and data structures for the analysis of nucleotide sequences, with emphasis on NGS data; includes indexing, compression, database search, and support for NGS-specific file formats (fastq, SAM/BAM, VCF, BED). |
Biopython [45] | 2009 | Python, C | Biopython | http://biopython.org/ | Sequence input/output; alignment input/output; population genetics; structural bioinformatics; SQL interface. |
htslib, SAMtools, BCFtools [37] | 2009 | C | MIT/Expat, Modified BSD | http://www.htslib.org/ | Read, write, edit, index, view SAM/BAM/CRAM formats; read, write BCF2/VCF/gVCF files; call, filter, summarize SNP/short indels. |
BioRuby [44] | 2010 | Ruby | GNU GPL | http://bioruby.open-bio.org/ | DNA and protein sequence analysis, sequence alignment, biological database parsing, ontology, structural biology. |
BAMTools [36] | 2011 | C++ | MIT | https://github.com/pezmaster31/bamtools | Read, write, manipulate BAM formats |
libStatGen [40] | 2011 | C++ | GNU GPL | https://github.com/statgen/libStatGen | Handle SAM/BAM, fastq, GLF, VCF, ASP. |
NGS++ [38] | 2013 | C++ | GNU Lesser GPL | https://github.com/NGS-lib/NGSplusplus | Read, write, manipulate multiple genomic file formats and data associated with BED type files (epigenomics). |
Bioclojure [39] | 2014 | Clojure | GNU Lesser GPL | https://github.com/s312569/clj-biosequence | Parsing of Genbank, Uniprot XML, fasta, fastq formats; wrappers for BLAST, signalP, TMHMM; index files for random access; lazy processing of sequences from very large files. |