2020 Jan 3;9(1):giz148. doi: 10.1093/gigascience/giz148

© The Author(s) 2020. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 2: Data and metadata components required for a successful re-analysis. (A) Raw sequencing data are usually labeled with an SRR Run ID or other processing ID. Raw data rarely contain information about the processing steps applied, and the research parasite must use other information to determine what processing needs to be done. (B) Technical metadata connect file names to the respective sample ID and may also include other technical information such as primer and barcode sequences. When the raw sequencing data have not been separated into sample-wise files, barcodes are required to map sequences to samples. These are often the most difficult data to find. (C) Finally, study-related metadata are required to re-analyze samples. Metadata directly related to the analysis question (e.g., disease status) are always necessary, but other metadata such as subject ID, sampling day, and replicate may also be required to ensure that proper statistical comparisons are being made.
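The linkage the caption describes, from raw file to technical metadata to study metadata, can be sketched as two lookups. This is only an illustrative sketch, not the authors' method. Every run ID, sample ID, barcode, and field name below is invented, and the function `link_metadata` is hypothetical. It assumes each raw file is named after its run ID and that technical metadata are keyed by run ID.

```python
# Hypothetical example data; every ID and value here is invented for illustration.

# (A) Raw sequencing files, assumed to be named after their run ID.
raw_files = ["SRR0000001.fastq", "SRR0000002.fastq", "SRR0000003.fastq"]

# (B) Technical metadata: run ID -> sample ID, plus technical details such as barcode.
technical_metadata = {
    "SRR0000001": {"sample_id": "S1", "barcode": "ACGT"},
    "SRR0000002": {"sample_id": "S2", "barcode": "TGCA"},
}

# (C) Study metadata: sample ID -> variables needed for the analysis question.
study_metadata = {
    "S1": {"disease_status": "case", "subject_id": "P1", "sampling_day": 0},
}


def link_metadata(files, technical, study):
    """Join raw files to study metadata via technical metadata.

    Returns (linked, missing): records usable for re-analysis, and
    (run_id, reason) pairs for files that cannot be linked.
    """
    linked, missing = [], []
    for name in files:
        run_id = name.split(".")[0]  # assumption: file name starts with the run ID
        tech = technical.get(run_id)
        if tech is None:
            missing.append((run_id, "no technical metadata"))
            continue
        sample = study.get(tech["sample_id"])
        if sample is None:
            missing.append((run_id, "no study metadata"))
            continue
        linked.append({"run_id": run_id, **tech, **sample})
    return linked, missing


linked, missing = link_metadata(raw_files, technical_metadata, study_metadata)
for run_id, reason in missing:
    print(f"{run_id}: {reason}")
```

In this toy data only one of the three files can be carried through to analysis. The other two fail at different links of the chain, which mirrors the caption's point that all three components are needed.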