Table 1.
Independent and informative covariates used in case studies
Case study | Covariates found to be independent and informative |
---|---|
Microbiome | Ubiquity: the proportion of samples in which the feature is present. In microbiome data, it is common for many features to go undetected in many samples. |
| | Mean nonzero abundance: the average abundance of a feature among those samples in which it was detected. We note that this did not seem as informative as ubiquity in our case studies. |
GWAS | Minor allele frequency: the proportion of the population that exhibits the less common allele (ranges from 0 to 0.5); it represents the rarity of a particular variant. |
| | Sample size (for meta-analyses): the number of samples for which the particular variant was measured. |
Gene set analyses | Gene set size: the number of genes included in the particular set. Note, however, that gene set size is not independent under the null for over-representation tests (see Additional file 1: Supplementary Results). |
Bulk RNA-seq | Mean gene expression: the average expression level (calculated from normalized read counts) for a particular gene. |
Single-cell RNA-seq | Mean nonzero gene expression: the average expression level (calculated from normalized read counts) for a particular gene, excluding zero counts. |
| | Detection rate: the proportion of samples in which the gene is detected. In single-cell RNA-seq it is common for many genes to go undetected in many samples. |
ChIP-seq | Mean read depth: the average coverage (calculated from normalized read counts) for the region. |
| | Window size: the length of the region. |
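
Several of the covariates in Table 1 are simple per-feature summaries, and a small worked example may make the definitions concrete. The sketch below is illustrative only and is not the code used in the case studies: the function names, the toy data, and the coding of genotypes as alternate-allele counts (0, 1, or 2 per diploid individual) are all assumptions introduced here. Ubiquity and detection rate share the same computation, as do mean nonzero abundance and mean nonzero gene expression.

```python
def ubiquity(counts):
    """Proportion of samples in which the feature is detected (count > 0).

    The same calculation gives the detection rate of a gene in
    single-cell RNA-seq data.
    """
    return sum(1 for c in counts if c > 0) / len(counts)


def mean_nonzero(counts):
    """Average value among samples in which the feature was detected.

    Returns None when the feature is never detected, because the
    average is undefined in that case.
    """
    nonzero = [c for c in counts if c > 0]
    return sum(nonzero) / len(nonzero) if nonzero else None


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele, always between 0 and 0.5.

    Assumption: each genotype is the number of alternate alleles
    (0, 1, or 2) carried by one diploid individual.
    """
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


# Hypothetical toy data: one feature measured in eight samples,
# and one variant genotyped in eight individuals.
feature_counts = [0, 0, 4, 0, 8, 0, 0, 12]
genotypes = [0, 1, 0, 2, 0, 0, 1, 0]

feature_ubiquity = ubiquity(feature_counts)          # detected in 3 of 8 samples
feature_mean_nonzero = mean_nonzero(feature_counts)  # mean of 4, 8, and 12
variant_maf = minor_allele_frequency(genotypes)      # 4 alternate alleles out of 16
```

In practice these summaries would be computed for every feature at once from a normalized count matrix; the per-feature functions above are kept separate only to mirror the definitions in the table.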