Table 3. Summary of CNV calling methods for scRNA-seq data

| Method | Brief Explanation | Input | Resolution | Advantages | Disadvantages |
|---|---|---|---|---|---|
| InferCNV | Hidden Markov model (i3 or i6) plus Bayesian analysis. The i3 model has three states: deletion, neutral, and amplification. The i6 model has six states: complete loss, loss of one copy, neutral, addition of one copy, addition of two copies, and more than three copies. | Expression profiling | Identification of large-scale, chromosome-scale CNVs | 1) Works both with and without a normal-cell reference; 2) provides two analysis modes, treating predefined cell types as whole samples or defining subclusters based on CNV patterns; 3) provides an interactive R Shiny web app. | Assumes that the copy number dosage is constant over the whole predicted region. |
| HoneyBADGER | Hidden Markov model and Bayesian approach | Allelic imbalance and normalized expression profiling | Robust identification of sub-clonal focal alterations as small as 10 Mb; identification of CNVs at the chromosome-arm level with a frequency as low as 30% of target cells, and at the full-chromosome level. | 1) Identification of CNVs as small as 10 Mb, a much higher resolution than that of average-expression-based methods; 2) detection of copy-number-neutral loss-of-heterozygosity events. | 1) Requires WES data or common natural SNP information from other public datasets as a reference to generate heterozygous SNP positions; 2) aims to distinguish copy number alteration regions from copy-number-neutral regions rather than to estimate precise copy numbers. |
| CONICS | Comparison of the control distribution and the observed distribution at each CNV region (CNVR) in each cell. | Expression profiling | CNV regions inferred from other DNA sequencing data, or the chromosome-arm level. | Provides routines for further differential-expression, phylogeny, and co-expression network analyses. | 1) Requires predefined CNV locations from orthogonal DNA sequencing data such as WES; 2) cannot identify novel CNV regions. |
| CONICSmat | Bayesian approach: chi-squared likelihood-ratio test comparing a 2-component Gaussian mixture model with a 1-component Gaussian model. | Expression profiling | Chromosome-arm level | 1) No need for an explicit normal control dataset or DNA sequencing data; 2) provides routines for further differential-expression, phylogeny, and co-expression network analyses. | 1) Identifies CNVs only at the megabase scale; 2) cannot identify gene-level CNVs. |
| CaSpER | Hidden Markov model and Bayesian approach | Allele frequency shift + expression profiling | Large-scale, gene-based, and segment-based CNV calls | 1) Variant calling is not needed, which speeds up the whole detection process; 2) provides a number of downstream analyses: inferring clonal evolution, discovering mutually exclusive and co-occurring CNV events, and identifying gene expression signatures of the identified clones. | 1) The true positive rate reaches only 60–80%; 2) the detection accuracy is much higher for deletions than for amplifications. |
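
To make the CONICSmat entry more concrete, the following is a minimal Python sketch of a likelihood-ratio test that compares a 2-component Gaussian mixture with a single Gaussian for one region's per-cell average expression. It is not CONICSmat's code (CONICSmat is an R package): the function names, the median-split EM initialization, the variance floor `min_sigma`, and the choice of 3 degrees of freedom (5 mixture parameters minus 2 single-Gaussian parameters) are all assumptions made for illustration. The chi-squared reference distribution is only an approximation for mixture models, so the resulting p-value should be read as a rough score.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma * sigma) - (x - mu) ** 2 / (2.0 * sigma * sigma)


def mean_and_sd(xs, min_sigma):
    """Maximum-likelihood mean and standard deviation, with a floor on the sd."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, max(math.sqrt(var), min_sigma)


def loglik_one_component(xs, min_sigma=0.05):
    """Log-likelihood of the data under a single fitted Gaussian."""
    mu, sigma = mean_and_sd(xs, min_sigma)
    return sum(normal_logpdf(x, mu, sigma) for x in xs)


def loglik_two_components(xs, n_iter=200, min_sigma=0.05):
    """Log-likelihood under a 2-component Gaussian mixture fitted by EM.

    Assumption: EM starts from a split of the sorted values at the median.
    The returned value is the log-likelihood computed in the last E-step.
    """
    ordered = sorted(xs)
    half = len(ordered) // 2
    params = [mean_and_sd(ordered[:half], min_sigma), mean_and_sd(ordered[half:], min_sigma)]
    weights = [0.5, 0.5]
    loglik = float("-inf")
    for _ in range(n_iter):
        # E-step: responsibilities via a numerically stable log-sum-exp.
        resp = []
        loglik = 0.0
        for x in xs:
            logs = [math.log(w) + normal_logpdf(x, mu, sd) for w, (mu, sd) in zip(weights, params)]
            top = max(logs)
            terms = [math.exp(v - top) for v in logs]
            total = sum(terms)
            loglik += top + math.log(total)
            resp.append([t / total for t in terms])
        # M-step: update weights, means, and standard deviations.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-9)
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
            params[k] = (mu, max(math.sqrt(var), min_sigma))
            weights[k] = max(nk / len(xs), 1e-9)
    return loglik


def chi2_sf_df3(x):
    """Survival function of the chi-squared distribution with 3 degrees of freedom."""
    if x <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


def lrt_pvalue(xs):
    """Return (likelihood-ratio statistic, approximate p-value) for 2 vs 1 components."""
    stat = max(0.0, 2.0 * (loglik_two_components(xs) - loglik_one_component(xs)))
    return stat, chi2_sf_df3(stat)


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical region scores: half of the cells carry a shifted expression level.
    scores = [random.gauss(0.0, 0.3) for _ in range(100)] + [random.gauss(1.5, 0.3) for _ in range(100)]
    print(lrt_pvalue(scores))
```

A small p-value suggests that the cells split into two expression groups for that region, which is the kind of signal used to flag a candidate chromosome-arm-level CNV without a normal control.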