2018 Nov 30;19(Suppl 15):443. doi: 10.1186/s12859-018-2423-8

Fig. 2.

Conceptual and implemented architecture of the GLOSSary database. a Entity-Relationship diagram of the GLOSSary data. The Tara sequences and metadata are organised in runs (TARA RUN), each belonging to a specific station (TARA STATION). TARA MiTAGs are short sequences obtained by merging the paired-end reads that match the Silva 16S database, grouped by station, depth and fraction. The GLOSSary 16SContigs are longer sequences obtained by re-assembling the MiTAGs with the GLOSSary pipeline and mapping them to the Silva database. b The GLOSSary MongoDB document organisation. The data presented in panel a have been denormalised and reorganised into two main documents: PLACE (B1) and SEQUENCE (B2). The PLACE document holds all the information on the Tara stations and their associated runs and metadata. The SEQUENCE document holds the nucleotide sequences of the GLOSSary 16SContigs and of the unassembled MiTAGs, together with the associated metadata and, for the 16SContigs, the taxonomic information and the identifiers of the assembled MiTAGs.
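
The two-document layout described in the caption could look roughly like the following. This is a minimal sketch using plain Python dictionaries, not the actual GLOSSary schema: every field name (`runs`, `kind`, `taxonomy`, `mitag_ids`, and so on), the helper functions, and all values are hypothetical placeholders chosen for illustration. The caption only states which kinds of information each document holds.

```python
# Sketch of the denormalised PLACE / SEQUENCE layout.
# All field names and values are hypothetical; only the two document
# types and the kinds of data they hold come from the caption.

def make_place(station_id, metadata, runs):
    """PLACE: one station with its runs and metadata embedded."""
    return {"_id": station_id, "metadata": metadata, "runs": runs}


def make_sequence(seq_id, kind, sequence, station_id, depth, fraction,
                  taxonomy=None, mitag_ids=None):
    """SEQUENCE: a 16SContig or an unassembled MiTAG plus its metadata."""
    doc = {
        "_id": seq_id,
        "kind": kind,              # "16SContig" or "MiTAG"
        "sequence": sequence,      # nucleotide sequence
        "station": station_id,     # link back to the PLACE document
        "depth": depth,
        "fraction": fraction,
    }
    if kind == "16SContig":
        # Only contigs carry taxonomy and the ids of the MiTAGs they assemble.
        doc["taxonomy"] = taxonomy or []
        doc["mitag_ids"] = mitag_ids or []
    return doc


def sequences_for_station(sequences, station_id):
    """Return the SEQUENCE documents sampled at a given station."""
    return [s for s in sequences if s["station"] == station_id]


# Placeholder data, not taken from the database.
place = make_place(
    "STATION_X",
    metadata={"note": "placeholder station metadata"},
    runs=[{"run_id": "RUN_1", "depth": "DEPTH_A", "fraction": "FRACTION_A"}],
)
contig = make_sequence(
    "CONTIG_1", "16SContig", "ACGTACGT", "STATION_X", "DEPTH_A", "FRACTION_A",
    taxonomy=["Bacteria"], mitag_ids=["MITAG_2", "MITAG_3"],
)
mitag = make_sequence(
    "MITAG_1", "MiTAG", "ACGT", "STATION_X", "DEPTH_A", "FRACTION_A",
)
```

Embedding the runs inside PLACE means a station and its runs can be read in a single lookup, while each SEQUENCE document keeps a station reference so that sequences can still be filtered by sampling site, as `sequences_for_station` does here.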