GigaScience. 2022 Jan 12;11:giab090. doi: 10.1093/gigascience/giab090

Table 2: List of the functional inference tools, ecological trait assignment tools, and databases

Columns: Tool; Implementation; Targeted genes; Functional prediction; Approaches; Methods; Inputs used; Strengths and specificities; Limitations
PanFP Perl (recently Python) 16S rRNA KO, Gene Ontology, Pfam, TIGRFAM Functional inference Builds a pangenome NCBI taxonomy
  • Uses functional profile of the pangenome so could be less sensitive to horizontal gene transfer

  • Evolutionary models are not taken into account

 
  • No confidence score generated

 
  • Not yet available for microbial eukaryotes

PAPRICA Python 16S/18S rRNA MetaCyc ontology Functional inference Phylogenetic placement Based on rDNA amplicon sequences
  • 18S rRNA amplicons are taken into account

  • Examples on the developer's blog

  • Errors may occur with sequence placement due to poor resolution of rRNA amplicons in some clades

PICRUSt Python 16S rRNA KO, KEGG Pathway, COG, CAZy Functional inference ASR (Wagner Parsimony, ACE ML, ACE REML, ACE PIC) Greengenes taxonomy (18may2012 or v13.5/v13.8)
  • Evolutionary models are taken into account

 
  • Confidence score generated (NSTI)

 
  • Correction of OTU copy numbers

  • Based on specific taxonomy (Greengenes identifiers)

 
  • KEGG database not updated since 2011

 
  • No pre-calculated table of fungal genomes available

PICRUSt2 Python/R 16S/18S rRNA/ITS MetaCyc, KO, EC number, COG, Pfam, TIGRFAM Functional inference HSP (maximum parsimony, empirical probabilities, subtree averaging, SCP) Based on rDNA amplicon sequences
  • Evolutionary models are taken into account

 
  • Confidence score generated (NSTI)

 
  • Twice as many KO scores

 
  • Multiple HSP methods can be implemented (takes branch length weighting into account)

 
  • 18S rRNA and ITS amplicons are taken into account

 
  • Extensive documentation and active community

  • Errors may occur with sequence placement owing to poor resolution of rRNA amplicons in some clades

Piphillin Web-based 16S rRNA BioCyc, KEGG Functional inference Nearest-neighbor matching of 16S rRNA gene amplicons with genomes from reference databases Based on rDNA amplicon sequences
  • Regular updates of functional databases

 
  • rRNA copy number adjustment

  • Available online only

 
  • Available for 16S rRNA only

SINAPS USEARCH 16S rRNA Trait annotation (e.g., energy metabolism, Gram-positive staining, presence of a flagellum) Functional inference Word counting Greengenes, SILVA
  • Confidence is estimated by bootstrapping

 
  • Integrated into the USEARCH tool

  • No peer-reviewed publication (bioRxiv preprint)

 
  • Detailed explanation is missing (e.g., how was the ProTraits input created?)

Tax4Fun R package 16S rRNA KO Functional inference Nearest-neighbor search based on a minimum 16S rRNA sequence similarity SILVA taxonomy
  • Uses R (multiplatform) with pre-calculated files

 
  • Confidence score generated (FTU and FSU)

 
  • Because it requires a minimum similarity between sequences, the algorithm could predict poorly characterized taxa better than ASR-based approaches, which may span large distances in the tree

  • Based on specific taxonomy (SILVA identifiers)

 
  • KEGG database not updated since 2011

Tax4Fun2 R package 16S rRNA KO Functional inference BLAST Based on rDNA amplicon sequences
  • Algorithm with a minimal sequence similarity

 
  • Uses R (multiplatform) with pre-calculated, highly memory-efficient platform-independent files

 
  • Confidence score generated (FTU and FSU)

 
  • KO update from 2018

 
  • Calculates the redundancy of specific functions directly

 
  • Builds its own habitat-specific reference

  • Not yet available for microbial eukaryotes

Vikodak Web-based (no longer available) 16S rRNA KEGG pathway, EC number Functional inference Microbial co-existence patterns RDP, SILVA
  • Pathway exclusion cut-off value is available to provide the minimum percentage of genes/enzymes belonging to a metabolic pathway required to consider the pathway as functional

 
  • Compares 2 datasets

  • No longer available

 
  • Not yet available for microbial eukaryotes

iVikodak Web-based 16S rRNA KEGG, Pfam, COG, TIGRFAM Functional inference Microbial co-inhabitance patterns RDP, Greengenes, SILVA
  • User-friendly for non-expert bioinformaticians

 
  • Integrated tools for statistical comparisons

 
  • Graphical visualizations

  • Available online only

 
  • Not yet available for microbial eukaryotes

FUNGuild Python/Web-based ITS Guild type Trait assignment Not applicable Based on UNITE taxonomy (ITS)
  • Trait quality for taxon assignment

  • No regular update

 
  • 18S rRNA taxonomy with related database not included. However, the database is open-access, and a homemade wrapper can be used for 18S metabarcoding output

FAPROTAX Python; flat file 16S rRNA Ecological functions (e.g., nitrification, denitrification, or fermentation) Trait assignment, Database If all cultured type strains within a genus share a function, FAPROTAX assumes that all uncultured organisms of that genus also possess the putative function SILVA (128, 132)
  • Based on the literature of cultured taxa

 
  • All literature used to create the database is available

 
  • Functions easily added to the tool

  • Implicit assumption (see Methods column) could prove false as more organisms are newly cultured

 
  • Does not infer upper rank when taxonomic resolution is poor

BacDive Python and R API, R package Morphology, physiology (API®-tests), molecular data, and cultivation conditions Database Not applicable NCBI taxonomy
  • Provides links to ENA, GenBank, SILVA, BRENDA, GBIF, ChEBI, Straininfo website data

 
  • A match with 16S rRNA sequences is available from SILVA

  • Does not provide a tool for metabarcoding output

BugBase R/Python 16S rRNA KEGG Functional inference PICRUSt, custom trait assignment Greengenes
  • Biologically interpretable traits (Gram staining, oxygen tolerance, biofilm formation, pathogenicity, mobile element content, and oxidative stress tolerance)

  • No peer-reviewed publication (bioRxiv preprint)

IJSEM Flat file with R script for curation IJSEM Database Not applicable Not applicable
  • 16S rRNA accession numbers available

  • Does not provide a tool for metabarcoding output

ProTraits Web-based; flat files Wikipedia, MicrobeWiki, HAMAP proteomes, PubMed abstracts and publications, Bacmap, Genoscope, JGI, KEGG, NCBI, Karyn's Genomes Database Not applicable Not applicable
  • Phenotypic inference

 
  • Large resource (∼545,000 phenotypes spanning 424 traits across 3,046 species)

 
  • NCBI taxonomy available

  • Does not provide a tool for metabarcoding output

BURRITO Web-based 16S rRNA KO Functional inference PICRUSt Greengenes
  • Enables simultaneous, integrative exploration of taxonomic and functional profiles

  • Based on PICRUSt v1

MACADAM Python/web implementation 16S rRNA MetaCyc, MicroCyc, FAPROTAX, IJSEM Functional inference, Trait assignment Custom methods (provides functional information about upper-rank taxa when organism name is not found) NCBI taxonomy
  • Pathway score and pathway frequency score are provided, indicating the number of enzymes present in the pathway

  • Not yet available for microbial eukaryotes

FunFun R package; flat file Ecological traits Trait assignment Not applicable Based on UNITE taxonomy (ITS)
  • Uses R (multiplatform)

 
  • Complementary to FUNGuild

FungalTraits Flat files Guild type, body type, habitat Trait assignment Not applicable Based on UNITE taxonomy (ITS)
  • Expert work to propose traits at the genus level

 
  • Merges the FUNGuild and FunFun tools

 
  • An Excel file with a VLOOKUP function is available to assign guilds or trait data

  • Does not provide a tool for metabarcoding output

DEEMY Web-based Morphology, anatomy, potential for chemical reactions, or even ecology traits Database Not applicable Not applicable
  • Links to associated tree species

 
  • Includes images

  • Specialized in ectomycorrhizas only

Bacteria-archaea-traits R package; flat file 16S rRNA Traits, phenotypic traits, quantitative genomic traits Database Not applicable NCBI taxonomy, GTDB taxonomy
  • Groups the major bacterial and archaeal databases into 1 database

 
  • Traits and species data condensed

 
  • R workflow available to retrieve condensed trait and species data

OntoBiotope Web-based Habitats and phenotypes Database ToMap (Text to ontology mapping) NCBI taxonomy
  • Term relevance is evaluated by the semantic search engine PubMedBiotope

 
  • Maintained by ∼30 microbiology experts

  • Dedicated to the food domain

@Minter Python Microbial interactions Machine learning Support-vector machine (SVM)-based classifier No specific taxonomy, just species level
  • Original approach to get information on microbial interactions rapidly

  • Species name required
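
The Methods entry for FAPROTAX in the table describes a simple propagation rule: a function is assigned to a whole genus only when every cultured member of that genus shares it. The sketch below illustrates that rule under stated assumptions. It is not FAPROTAX's actual code or file format; the function names (`genus_level_functions`, `assign_functions`), the input layout, and the placeholder genera (`GenusA`, `GenusB`, `GenusC`) are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def genus_level_functions(cultured_records):
    """Return, per genus, the functions shared by ALL of its cultured members.

    cultured_records: iterable of (genus, species, functions) tuples, where
    functions is a set of function names. This layout is an assumption made
    for the sketch, not the FAPROTAX database format.
    """
    by_genus = defaultdict(list)
    for genus, _species, functions in cultured_records:
        by_genus[genus].append(set(functions))
    # A function is kept only if every cultured member of the genus has it.
    return {genus: set.intersection(*sets) for genus, sets in by_genus.items()}


def assign_functions(genus, genus_functions):
    """Assign functions to an uncultured organism from its genus alone.

    Unknown genera receive no functions; no upper-rank inference is attempted,
    which mirrors the limitation listed for FAPROTAX in the table.
    """
    return genus_functions.get(genus, set())


# Toy, made-up records; not real organisms or measured traits.
records = [
    ("GenusA", "GenusA alpha", {"nitrification", "fermentation"}),
    ("GenusA", "GenusA beta", {"nitrification"}),
    ("GenusB", "GenusB gamma", {"fermentation"}),
]
genus_functions = genus_level_functions(records)
print(assign_functions("GenusA", genus_functions))  # only the shared function
print(assign_functions("GenusC", genus_functions))  # unknown genus: empty set
```

In this toy data, `GenusA` keeps only `nitrification` because `fermentation` is missing from one member. The same mechanism shows why the assumption is fragile: adding a newly cultured member that lacks a function removes that function from the whole genus.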