Table 1.
Method | Ref. | Input | Output: Prop. | Output: Expr. | Output: Individual profile | Clinical data: Cancer | Clinical data: Normal blood | Clinical data: Other | Availability: R | Availability: CellMix | Availability: MATLAB | Availability: Other |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ISOpure (Quon) | [33] | tumour & unmatched normal | ||||||||||
DeMix (Ahn) | [32] | tumour & unmatched normal | ||||||||||
Clarke | [30] | paired mixed & pure profiles | ||||||||||
Gosink | [31] | mixed profiles and known profile of one constituent | ||||||||||
DeconRNASeq (Gong) | [18] | profiles of constituents | ||||||||||
Gong | [19] | cell-type specific gene signatures | ||||||||||
Abbas | [20] | cell-type specific gene signatures | ||||||||||
Wang M. | [21] | cell-type specific gene signatures | ||||||||||
Lu | [22] | cell-type specific gene signatures | * | |||||||||
PERT (Qiao) | [46] | reference profiles of constituents | † | † | ||||||||
ESTIMATE (Yoshihara) | [47] | prior data used to derive cell-type specific gene signatures | ||||||||||
DSection (Erkkilä) | [12] | prior knowledge of proportions | † | |||||||||
csSAM (Shen-Orr) | [13] | proportions of constituents | ||||||||||
Bar-Joseph | [14] | proportions of constituents, one expression profile | ||||||||||
Ghosh | [16] | proportions, tumour & unmatched normal | * | |||||||||
Stuart | [17] | proportions of constituents | ||||||||||
TEMT (Li) | [48] | prior knowledge of proportions, paired mixed-pure profiles | ||||||||||
DSA (Zhong) | [23] | cell markers | ||||||||||
ssNMF (Gaujoux) | [25] | cell markers | ||||||||||
PSEA (Kuhn) | [24] | cell markers | ||||||||||
deconf (Repsilber) | [26] | cell markers | ||||||||||
Tolliver | [49] | tumour profile, number of constituents | ||||||||||
Roy | [50] | prior estimate of number of constituents | ||||||||||
Lähdesmäki | [15] | mixed expression profiles | † | |||||||||
Venet | [27] | mixed expression profiles, number of constituents | ||||||||||
UNDO (Wang N.) | [51] | mixed expression profiles |
Most of the algorithms are applied to microarray mRNA abundance data, although TEMT and ESTIMATE use high-throughput RNA-Seq data, and ISOpure and DeconRNASeq can be applied to both [52]. The possible outputs of the algorithms are proportions of constituent cell types (Prop.), average expression profiles (Expr.), or patient-specific expression profiles (Individual profile) of constituent cell types. The two main sources of clinical data were cancer-related gene expression data (including human Hodgkin’s lymphomas) and normal blood expression data. PSEA was applied to expression data from patients with Huntington’s disease, and Bar-Joseph also studied cell-cycle-synchronized foreskin fibroblast cells. In terms of availability, the summary package CellMix [28] is also an R package but is listed as a separate category. The only algorithms not available for either R or MATLAB are PERT (Octave) and TEMT (Python). Algorithms that were described as using built-in MATLAB or R functions were not included, as reproducible example code is not available for them. The currently available source code is summarized in Additional file 2.
Notes:
†Prior information about proportions or expressions is needed, but these values are re-estimated during the execution of the algorithm. For PERT, the individual profiles are adjusted (perturbed) versions of the reference profiles.
*The original code for Lu (Java-based) [22] and Ghosh [16] is no longer available.
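To make the relationship between the "Input" and "Output" columns concrete, the following is a minimal sketch of how a proportion (Prop.) could be recovered from a mixed profile when reference profiles of two constituents are known. It assumes a simple linear mixing model and ordinary least squares; it is a toy illustration and not the implementation of any method listed in the table. The function name `estimate_proportion`, the variable names, and all numbers are hypothetical.

```python
def estimate_proportion(mixed, profile_a, profile_b):
    """Least-squares estimate of the fraction of cell type A in a two-component mixture.

    Assumed model (not taken from any listed method), for each gene g:
        mixed[g] = p * profile_a[g] + (1 - p) * profile_b[g]
    """
    # Per-gene difference between the two reference profiles
    diff = [a - b for a, b in zip(profile_a, profile_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical; proportion is not identifiable")
    # Closed-form least-squares solution for p
    numer = sum((m - b) * d for m, b, d in zip(mixed, profile_b, diff))
    p = numer / denom
    # Clip to the valid range for a proportion
    return min(1.0, max(0.0, p))


# Made-up reference profiles for three genes
profile_a = [10.0, 2.0, 5.0]
profile_b = [1.0, 8.0, 5.0]

# Simulate a mixed sample with a known proportion, then recover it
true_p = 0.3
mixed = [true_p * a + (1 - true_p) * b for a, b in zip(profile_a, profile_b)]
estimated_p = estimate_proportion(mixed, profile_a, profile_b)
```

Genes with identical expression in both reference profiles (the third gene here) contribute nothing to the estimate, which is one reason several of the listed methods take cell-type specific signatures or markers as input.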