2015 Oct 26;44(Database issue):D917–D924. doi: 10.1093/nar/gkv1101

Table 3. Data set overview.

Data set	Features	Samples	Normalisation method
Leukemia MILE study	67191	2095	1
Normal human hematopoiesis with AMLs	67191	296	1,7
Immgen Key populations	47273	256	2
AML versus normal	67191	252	3
AML TCGA data set	67191	244	1
AML TCGA data set versus normal	67191	244	3
AML Normal Karyotype	54675	234	1
AML Normal Karyotype versus normal	67191	234	3
Normal human hematopoiesis (DMAP)	35459	211	4
Immgen abT cells	47273	190	2
Immgen Dendritic cells	47273	151	2
Immgen MFs Monocytes Neutrophils	47273	114	2
Immgen B cells	47273	103	2
Normal human hematopoiesis (HemaExplorer)	57270	77	5
Immgen gdT cells	47273	76	2
Immgen Stem and progenitor cells	47273	76	2
Mouse normal hematopoietic system	57613	67	4
Immgen Activated T cells	47273	55	2
Immgen NK cells	47273	47	2
Immgen Stromal cells	47273	39	2
Mouse normal (RNA seq)	45426	52	6
BloodPool	67191	2120	1,7
BloodPool versus normal	67191	2076	3,7

Normalisation method legend:

1 Each cancer sample is normalised together with a set of samples from sorted normal myeloid populations. All samples were normalised using RMA. Comparison of gene expression values with other data sets in BloodSpot is not possible.

2 All samples from the ImmGen data sets were normalised together with RMA. Samples were subsequently attributed to the different data sets in BloodSpot. This means that comparison of gene expression values is possible across all ImmGen data sets.

3 The data are normalised according to Rapin et al. Briefly, each cancer sample is normalised together with a set of samples from sorted normal myeloid populations. Next, using a PCA-based method, the 5 normal samples closest to the cancer sample are averaged, and this computed normal sample is then compared with the cancer sample, allowing gene expression fold changes to be computed. See Supplementary Methods and Rapin et al. (10).

4 All samples were normalised using RMA. Comparison of gene expression values is not possible with other data sets in BloodSpot.

5 See our previous work (Bagger et al. (3)).

6 The data were processed using the bcbio-nextgen RNA-seq pipeline. Count data were subsequently processed with DESeq2's variance stabilising transformation method.

7 The data were batch corrected using ComBat, taking study number as batch.
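The nearest-normal comparison described under method 3 can be illustrated with a minimal sketch. This is not the implementation of Rapin et al.: the function name `nearest_normal_fold_change`, fitting the PCA on the normal samples only, the number of components, and the use of Euclidean distance in PC space are all assumptions made for illustration. Expression values are assumed to be on the log2 scale (as RMA output is), so the fold change is a simple difference.

```python
import numpy as np


def nearest_normal_fold_change(cancer, normals, n_neighbours=5, n_components=3):
    """Return (log2 fold changes, indices of the chosen normal samples).

    cancer  : 1-D array, log2 expression of one cancer sample (genes)
    normals : 2-D array, log2 expression of normal samples (samples x genes)
    All names and defaults here are hypothetical, chosen for illustration.
    """
    cancer = np.asarray(cancer, dtype=float)
    normals = np.asarray(normals, dtype=float)

    # Fit a PCA on the normal samples: centre each gene, then take the SVD.
    centre = normals.mean(axis=0)
    _, _, vt = np.linalg.svd(normals - centre, full_matrices=False)
    axes = vt[:n_components]  # principal axes, shape (components, genes)

    # Project the normals and the cancer sample into the same PC space.
    normal_scores = (normals - centre) @ axes.T
    cancer_scores = (cancer - centre) @ axes.T

    # Pick the normal samples closest to the cancer sample.
    distances = np.linalg.norm(normal_scores - cancer_scores, axis=1)
    closest = np.argsort(distances)[:n_neighbours]

    # Average them into one computed normal sample; on the log2 scale the
    # fold change is the difference between cancer and computed normal.
    computed_normal = normals[closest].mean(axis=0)
    return cancer - computed_normal, closest


# Toy data: two groups of five normal samples and one cancer sample that
# resembles the second group except for the last gene.
normals = np.vstack([np.full((5, 4), 6.0), np.full((5, 4), 10.0)])
cancer = np.array([10.0, 10.0, 10.0, 12.0])
fold_change, closest = nearest_normal_fold_change(cancer, normals)
```

In the toy data, the five selected normals all come from the second group, so only the last gene shows a non-zero log2 fold change.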
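To give an intuition for what batch correction by study (method 7) does, the sketch below removes per-gene additive offsets between batches. It is a deliberately simplified stand-in and is not ComBat: ComBat additionally models multiplicative batch effects and shrinks the per-batch estimates with an empirical Bayes step. The function name `centre_batches` and the toy data are hypothetical.

```python
import numpy as np


def centre_batches(expr, batches):
    """Remove per-gene additive batch offsets (simplified, not ComBat).

    expr    : 2-D array, samples x genes, log2 expression
    batches : one batch label per sample (for example the study number)
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    grand_mean = expr.mean(axis=0)
    corrected = expr.copy()
    for label in np.unique(batches):
        rows = batches == label
        # Shift this batch so its per-gene mean matches the overall mean.
        corrected[rows] += grand_mean - expr[rows].mean(axis=0)
    return corrected


# Toy data: two studies measuring the same two genes with a constant offset.
expr = np.array([[1.0, 2.0], [3.0, 4.0], [11.0, 12.0], [13.0, 14.0]])
study = [1, 1, 2, 2]
corrected = centre_batches(expr, study)
```

A plain centring like this also removes any biological difference that happens to coincide with the study, so it should be read only as an illustration of the idea of aligning batches.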