Nucleic Acids Res. 2015 Oct 26;44(Database issue):D917–D924. doi: 10.1093/nar/gkv1101

Table 3. Data set overview.

Data set                                   Features  Samples  Normalisation method
Leukemia MILE study                        67191     2095     1
Normal human hematopoiesis with AMLs       67191     296      1,7
Immgen Key populations                     47273     256      2
AML versus normal                          67191     252      3
AML TCGA data set                          67191     244      1
AML TCGA data set versus normal            67191     244      3
AML Normal Karyotype                       54675     234      1
AML Normal Karyotype versus normal         67191     234      3
Normal human hematopoiesis (DMAP)          35459     211      4
Immgen abT cells                           47273     190      2
Immgen Dendritic cells                     47273     151      2
Immgen MFs Monocytes Neutrophils           47273     114      2
Immgen B cells                             47273     103      2
Normal human hematopoiesis (HemaExplorer)  57270     77       5
Immgen gdT cells                           47273     76       2
Immgen Stem and progenitor cells           47273     76       2
Mouse normal hematopoietic system          57613     67       4
Immgen Activated T cells                   47273     55       2
Immgen NK cells                            47273     47       2
Immgen Stromal cells                       47273     39       2
Mouse normal (RNA seq)                     45426     52       6
BloodPool                                  67191     2120     1,7
BloodPool versus normal                    67191     2076     3,7

Normalisation method legend:

1 Each cancer sample is normalised together with a set of samples from sorted normal myeloid populations. All samples were normalised using RMA. Gene expression values cannot be compared with those of other data sets in BloodSpot.

2 All samples from the ImmGen data sets were normalised together with RMA. Samples were subsequently attributed to the different data sets in BloodSpot. This means that comparison of gene expression values is possible across all ImmGen data sets.

3 The data are normalised according to Rapin et al. Briefly, each cancer sample is normalised together with a set of samples from sorted normal myeloid populations. Next, using a PCA-based method, the 5 normal samples closest to the cancer sample are averaged, and this computed normal sample is then compared to the cancer sample, allowing computation of gene expression fold changes. See Supplementary Methods and Rapin et al. (10).

4 All samples were normalised using RMA. Gene expression values cannot be compared with those of other data sets in BloodSpot.

5 See our previous work (Bagger et al. (3)).

6 The data were processed using the bcbio-nextgen RNA-seq pipeline. Count data were subsequently processed with DESeq2's variance stabilising transformation method.

7 The data were batch corrected using ComBat, with study number as the batch variable.
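
The comparison step in normalisation method 3 (find the 5 nearest normal samples, average them, compute fold changes) can be illustrated with a short sketch. This is not the implementation of Rapin et al.; the legend only says "a PCA-based method", so the details here are assumptions: PCA is fitted on the normal samples alone, distances are Euclidean in the space of the first few principal components, and the function name and the `n_components` parameter are hypothetical. Because RMA values are on a log2 scale, the fold change is taken as a difference.

```python
import numpy as np

def fold_change_vs_nearest_normals(cancer, normals, n_components=5, k=5):
    """Illustrative sketch only (not the published method).

    cancer  : (n_genes,) log2 expression of one cancer sample
    normals : (n_normals, n_genes) log2 expression of normal samples,
              normalised together with the cancer sample
    Returns the per-gene log2 fold change and the indices of the k normals used.
    """
    # Centre on the normal samples and get principal axes from an SVD.
    mean = normals.mean(axis=0)
    centred = normals - mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    axes = vt[:n_components]                      # (n_components, n_genes)

    # Project normals and the cancer sample into the PCA space (assumption).
    normal_scores = centred @ axes.T
    cancer_scores = (cancer - mean) @ axes.T

    # Pick the k normals closest to the cancer sample (Euclidean distance).
    dist = np.linalg.norm(normal_scores - cancer_scores, axis=1)
    nearest = np.argsort(dist)[:k]

    # Average them into one computed normal sample and compare.
    reference = normals[nearest].mean(axis=0)
    log2_fc = cancer - reference                  # log2 scale: difference = log ratio
    return log2_fc, nearest

# Synthetic usage example: two groups of normals, cancer resembles group A.
rng = np.random.default_rng(0)
base_a = rng.normal(8.0, 1.0, size=50)
base_b = base_a.copy()
base_b[:25] += 4.0
normals = np.vstack([
    base_a + rng.normal(0.0, 0.05, size=(6, 50)),   # rows 0-5: group A
    base_b + rng.normal(0.0, 0.05, size=(6, 50)),   # rows 6-11: group B
])
cancer = base_a.copy()
cancer[49] += 3.0                                   # one gene up-regulated
log2_fc, nearest = fold_change_vs_nearest_normals(cancer, normals)
```

In this synthetic case the selected normals all come from group A, and the fold change is large only for the altered gene. The choice of `n_components` and of the distance measure would need to follow the Supplementary Methods in any real analysis.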