2015 Sep 8;112(38):11941–11946. doi: 10.1073/pnas.1514285112

Fig. S2.

Cross-assembly strategy for data analysis. (A) For a given sample, a first round of stringent assembly yielded contigs that were then extended in a progressive (iterative) fashion. Contigs were subsequently pooled for all VLP samples within each human family and further extended. (B) The cross-assembly strategy yielded 17,676 contigs [lower cutoff, 500 bp; largest, 228,572 bp; contig coverage, 23 ± 2-fold (mean ± SEM); 85 ± 9% (mean ± SD) of the reads used per sample; n = 231 samples plus six technical replicates from two twin pairs]. Color code: red, contigs whose termini overlapped, suggesting circular and potentially complete viral genomes; blue, linear contigs (nonidentical termini). For contigs with greater than 10-fold coverage, circular contigs were mainly observed in the size ranges of 3–4 kb (typical size of ssDNA Anelloviridae genomes), ∼6–7 kb (typical size of ssDNA Microviridae genomes), and >30 kb (typical size of dsDNA phage genomes). (C) Taxonomic assignments were made for 44.14% of assembled contigs; 16.3% corresponded to eukaryotic viruses, mainly Anelloviridae, with the remainder consisting of phages, primarily dsDNA Caudovirales or the corresponding families Siphoviridae, Myoviridae, and Podoviridae. Contigs assigned to the Circoviridae and Anelloviridae were selected and searched for either the ORF encoding Rep (Circoviridae) or the product of ORF1 (Anelloviridae); only contigs with complete or almost complete sequences for these genes were analyzed. Reference proteins were downloaded from NCBI (Table S6) and used to identify clusters of viral families (colored shades). Neighbor-joining trees were built for Circoviridae (D) or Anelloviridae (E). Contigs that appeared discriminatory for Family, Health Status, or Village of Origin are highlighted in different colors. Numbers correspond to locations where a given reference sequence falls in the tree. Reference sequence IDs and accession numbers can be found in Table S6.
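
The red/blue distinction in (B) rests on whether a contig's two termini overlap. The caption does not say how the overlap was detected or what minimum length was required, so the following is only a minimal sketch of one way such a check could be done. The function names and the `min_overlap` threshold are hypothetical and are not taken from the study.

```python
def terminal_overlap(seq: str, min_overlap: int = 10) -> int:
    """Return the length of the longest prefix of `seq` that is identical
    to a suffix of `seq`, or 0 if no such overlap reaches `min_overlap`.

    `min_overlap` is an illustrative threshold (an assumption, not a
    value reported in the caption).
    """
    seq = seq.upper()
    # The two overlapping ends are limited to half the contig here so
    # that they never overlap each other within the sequence.
    max_len = len(seq) // 2
    # Try the longest candidate first and stop at the first exact match.
    for k in range(max_len, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def classify_contig(seq: str, min_overlap: int = 10) -> str:
    """Label a contig 'circular' if its termini overlap, else 'linear'."""
    return "circular" if terminal_overlap(seq, min_overlap) > 0 else "linear"


if __name__ == "__main__":
    end = "GATTACAGGC"
    circular_like = end + "T" * 20 + end          # identical termini
    linear_like = end + "T" * 20 + "CCGGAATTAA"   # nonidentical termini
    print(classify_contig(circular_like))
    print(classify_contig(linear_like))
```

An exact-match test like this is the simplest possible criterion. A real pipeline would likely tolerate mismatches at the termini and combine the result with the coverage filter mentioned in the caption.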