Table 2.
Summary of the FAIR Cookbook suggestions for required metadata (modified after chapter “11.5.1 Metadata profile for transcriptomics” of the current FAIR Cookbook (September 2023) [37]).
Metadata field | Definition | Comment | Metadata type |
---|---|---|---|
unique ID | Identifier for a sample that is at least unique within the project | | Common metadata, Assay metadata |
sample type | The type of the collected specimen, e.g., tissue biopsy, blood draw or throat swab | ontology field - e.g. OBI or EFO | Common metadata |
species | The primary species of the specimen, preferably the taxonomic identifier | This may not be the same as the “host” organism, e.g. in the case of a PDX tissue sample, the host may be a mouse but the tissue may be human. Ontology field - NCBITaxonomy | Common metadata |
tissue/organism part | The tissue from which the sample was taken | ontology field - e.g. Uberon | Common metadata |
sex | The biological/genetic sex of the sample | ontology field - e.g. PATO | Common metadata |
development stage | The developmental stage of the sample | ontology field - e.g. Uberon or HsapDv; species dependent | Common metadata |
disease | Any diseases that may affect the sample | This may not necessarily be the same as the host’s disease, e.g. healthy brain tissue might be collected from a host with type II diabetes while cirrhotic liver tissue might be collected from an otherwise healthy individual. Ontology field - e.g. MONDO or DO | Common metadata |
experiment type | The type of experiment performed, e.g., ATAC-seq or seqFISH | ontology field - e.g. EFO or OBI | Assay metadata |
analysis type | The type of analysis performed, e.g., genome assembly or variant calling | ontology field - e.g. EFO, OBI or EDAM | Analysis metadata |
platform | The type of instrument used to perform the assay, e.g., Illumina HiSeq 4000 or Fluidigm C1 microfluidics platform | ontology field - e.g. EFO or OBI | Assay metadata |
instrument model | The specific instrument on which the assay was performed. Essential for QC purposes. | ontology field - e.g. EFO or OBI | Assay metadata |
array or sequencing method | The array or sequencing technology used - may be the same as experiment type or can be a more specific term | ontology field - e.g. EFO or OBI | Assay metadata |
extracted nucleic acid/material type | The type of material that was extracted from the sample, e.g., polyA RNA | ontology field - e.g. ChEBI or EFO | Assay metadata |
nucleic acid extraction method | Technique used to extract the nucleic acid from the cell | ontology field - e.g. EFO or OBI | Assay metadata |
cDNA library amplification method | Technique used to amplify a cDNA library | ontology field - e.g. EFO or OBI | Assay metadata |
end bias | The type of tag or end bias the library has, e.g., 3 prime tag or 5 prime end bias | standardised field or ontology | Assay metadata |
biological or technical replicate | Information on whether the sample on which the assay was performed was a biological or technical replicate. | boolean or CV | Assay metadata |
computational method | The specific computational method or algorithm used as part of the analysis | ontology field - e.g. EFO or EDAM | Analysis metadata |
normalisation strategy | The approach used to normalise the data | ontology field - e.g. EFO or EDAM | Analysis metadata |
file format | The file format in which the analysis is provided | ontology field - e.g. EDAM | Analysis metadata |
file storage location | The location in which the data files are stored | | Analysis metadata |
collection date | The date on which the sample was collected, in a standardised format | Collection date in combination with other fields such as location and disease may be sufficient to de-anonymise a sample | Common metadata |
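
As an illustration of how the profile in Table 2 might be applied in practice, the following minimal Python sketch checks a sample record for missing required fields. The field names and their grouping by metadata type are taken from the table; the dictionary `REQUIRED_FIELDS`, the function `missing_fields` and the example record are hypothetical names and values introduced here for illustration only and are not part of the FAIR Cookbook.

```python
# Required fields per metadata type, copied from Table 2.
# The grouping follows the "Metadata type" column; "unique ID" is listed
# under both Common and Assay metadata, as in the table.
REQUIRED_FIELDS = {
    "Common metadata": [
        "unique ID", "sample type", "species", "tissue/organism part",
        "sex", "development stage", "disease", "collection date",
    ],
    "Assay metadata": [
        "unique ID", "experiment type", "platform", "instrument model",
        "array or sequencing method", "extracted nucleic acid/material type",
        "nucleic acid extraction method", "cDNA library amplification method",
        "end bias", "biological or technical replicate",
    ],
    "Analysis metadata": [
        "analysis type", "computational method", "normalisation strategy",
        "file format", "file storage location",
    ],
}


def missing_fields(record, metadata_type):
    """Return the required fields of one metadata type that are absent or empty."""
    missing = []
    for field in REQUIRED_FIELDS[metadata_type]:
        value = record.get(field)
        # Treat None and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical sample record (assumption: values are illustrative only).
sample = {
    "unique ID": "S001",
    "sample type": "tissue biopsy",
    "species": "NCBITaxon:9606",
    "tissue/organism part": "liver",
    "sex": "female",
    "disease": "",
}

print(missing_fields(sample, "Common metadata"))
# → ['development stage', 'disease', 'collection date']
```

This sketch only tests for the presence of a value. Checking that a value is a valid term from the suggested ontology (e.g. Uberon or EFO) would require a lookup against that ontology and is not attempted here.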