Table 1. PCA and data quality evaluation (simulation)
Table 2a. PCA-correction
Designed feature | GC-percentage | Date and scanner
Identified component | 1st | 2nd
P-value | <1E-23 | <1E-23
Table 2b. Evaluation of data quality
Data quality | High noise: σLRR | High noise: Nsub_ex | Low noise: σLRR | Low noise: Nsub_ex
Uncorrected | 0.30±0.03 | 76 | 0.25±0.03 | 10
Corrected (comp. 1) | 0.28±0.02 | 46 | 0.23±0.02 | 4
Corrected (comp. 1,2) | 0.28±0.02 | 40 | 0.22±0.02 | 4
Table 2c. Detection accuracy: PCA-correction
Total generated markers with CNVs: 75867
PennCNV results | Overall FPR | Overall FNR
Uncorrected | 0.6220 | 0.1374
Corrected (comp. 1) | 0.0389 | 0.0940
Corrected (comp. 1,2) | 0.0351 | 0.0886
Table 2d. Detection accuracy: regression-based correction
PennCNV results | Overall FPR | Overall FNR
GC-percentage corrected | 0.0389 | 0.0944
Note: High/low noise: groups simulated with high-SD/low-SD Gaussian noise, each containing 100 samples; σLRR: overall standard deviation of the simulated LRR data for each sample; Nsub_ex: number of samples excluded because they failed quality control; FPR and FNR: false-positive and false-negative rates, both calculated relative to the total number of markers with CNVs.
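The definitions in the note can be illustrated with a minimal sketch. It assumes per-marker boolean labels for the simulated truth and for the calls, and it takes the sample standard deviation as the σLRR estimator; both are assumptions for illustration. The function names `sample_lrr_sd` and `detection_error_rates` are hypothetical, and this is not the authors' evaluation code or part of PennCNV.

```python
from statistics import stdev


def sample_lrr_sd(lrr_by_sample):
    """Return the standard deviation of LRR values for each sample.

    lrr_by_sample maps a (hypothetical) sample name to its list of LRR values.
    The sample standard deviation (n - 1 denominator) is an assumption here.
    """
    return {name: stdev(values) for name, values in lrr_by_sample.items()}


def detection_error_rates(truth, called):
    """Return (FPR, FNR), both relative to the number of markers with true CNVs.

    truth[i]  is truthy if marker i carries a simulated CNV.
    called[i] is truthy if marker i was called as a CNV.
    Following the table note, both rates share the same denominator:
    the total number of markers with CNVs.
    """
    if len(truth) != len(called):
        raise ValueError("truth and called must have the same length")
    n_cnv = sum(1 for t in truth if t)
    if n_cnv == 0:
        raise ValueError("no markers with CNVs; rates are undefined")
    false_pos = sum(1 for t, c in zip(truth, called) if c and not t)
    false_neg = sum(1 for t, c in zip(truth, called) if t and not c)
    return false_pos / n_cnv, false_neg / n_cnv
```

Because the FPR denominator is the number of true CNV markers rather than the number of CNV-free markers, the value is not bounded by 1 and can be large when many spurious calls are made.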