Table 2.
Summary of the FAIR Cookbook suggestions for required metadata, modified after chapter “11.5.1 Metadata profile for transcriptomics” of the current FAIR Cookbook (September 2023) [37].
| Metadata field | Definition | Comment | Metadata type |
|---|---|---|---|
| unique ID | Identifier for a sample that is at least unique within the project | | Common metadata, Assay metadata |
| sample type | The type of the collected specimen, e.g., tissue biopsy, blood draw or throat swab | ontology field - e.g. OBI or EFO | Common metadata |
| species | The primary species of the specimen, preferably the taxonomic identifier | This may not be the same as the “host” organism, e.g. in the case of a PDX tissue sample, the host may be a mouse but the tissue may be human. Ontology field - NCBITaxonomy | Common metadata |
| tissue/organism part | The tissue from which the sample was taken | ontology field - e.g. Uberon | Common metadata |
| sex | The biological/genetic sex of the sample | ontology field - e.g. PATO | Common metadata |
| development stage | The developmental stage of the sample | ontology field - e.g. Uberon or HsapDv; species dependent | Common metadata |
| disease | Any diseases that may affect the sample | This may not necessarily be the same as the host’s disease, e.g. healthy brain tissue might be collected from a host with type II diabetes while cirrhotic liver tissue might be collected from an otherwise healthy individual. Ontology field - e.g. MONDO or DO | Common metadata |
| experiment type | The type of experiment performed, e.g., ATAC-seq or seqFISH | ontology field - e.g. EFO or OBI | Assay metadata |
| analysis type | The type of analysis performed, e.g., genome assembly or variant calling | ontology field - e.g. EFO, OBI or EDAM | Analysis metadata |
| platform | The type of instrument used to perform the assay, e.g., Illumina HiSeq 4000 or Fluidigm C1 microfluidics platform | ontology field - e.g. EFO or OBI | Assay metadata |
| instrument model | The specific instrument on which the assay was performed. Essential for QC purposes. | ontology field - e.g. EFO or OBI | Assay metadata |
| array or sequencing method | The array or sequencing technology used - may be the same as experiment type or can be a more specific term | ontology field - e.g. EFO or OBI | Assay metadata |
| extracted nucleic acid/material type | The type of material that was extracted from the sample, e.g., polyA RNA | ontology field - e.g. ChEBI or EFO | Assay metadata |
| nucleic acid extraction method | Technique used to extract the nucleic acid from the cell | ontology field - e.g. EFO or OBI | Assay metadata |
| cDNA library amplification method | Technique used to amplify a cDNA library | ontology field - e.g. EFO or OBI | Assay metadata |
| end bias | The type of tag or end bias the library has, e.g., 3 prime tag or 5 prime end bias | standardised field or ontology | Assay metadata |
| biological or technical replicate | Whether the sample on which the assay was performed was a biological or a technical replicate | boolean or CV | Assay metadata |
| computational method | The specific computational method or algorithm used as part of the analysis | ontology field - e.g. EFO or EDAM | Analysis metadata |
| normalisation strategy | The approach used to normalise the data | ontology field - e.g. EFO or EDAM | Analysis metadata |
| file format | The file format in which the analysis is provided | ontology field - e.g. EDAM | Analysis metadata |
| file storage location | The location in which the data files are stored | | Analysis metadata |
| collection date | The date on which the sample was collected, in a standardised format | Collection date in combination with other fields such as location and disease may be sufficient to de-anonymise a sample | Common metadata |
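
A profile like the one in Table 2 can be checked automatically before data are deposited. The following is a minimal sketch of such a completeness check, not part of the FAIR Cookbook itself: the field names follow Table 2, while the dictionary `REQUIRED_FIELDS`, the function `missing_fields` and the example record are hypothetical names and values introduced here for illustration only.

```python
# Field names follow Table 2, grouped by its "Metadata type" column.
# The grouping structure and all identifiers below are illustrative assumptions.
REQUIRED_FIELDS = {
    "Common metadata": [
        "unique ID", "sample type", "species", "tissue/organism part",
        "sex", "development stage", "disease", "collection date",
    ],
    "Assay metadata": [
        "unique ID", "experiment type", "platform", "instrument model",
        "array or sequencing method", "extracted nucleic acid/material type",
        "nucleic acid extraction method", "cDNA library amplification method",
        "end bias", "biological or technical replicate",
    ],
    "Analysis metadata": [
        "analysis type", "computational method", "normalisation strategy",
        "file format", "file storage location",
    ],
}


def missing_fields(record, metadata_type):
    """Return the fields of the given metadata type that are absent or empty."""
    missing = []
    for field in REQUIRED_FIELDS[metadata_type]:
        value = record.get(field)
        # Treat None and whitespace-only strings as "not provided".
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical, incomplete sample record (values are made up for the example).
sample = {
    "unique ID": "S001",
    "sample type": "tissue biopsy",
    "species": "NCBITaxon:9606",
    "sex": "",
}

print(missing_fields(sample, "Common metadata"))
```

This check only tests whether a value is present. Verifying that a value is a valid term from the suggested ontology (e.g. Uberon or EFO) would require an ontology lookup and is outside the scope of this sketch.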