Table 1.
Metric | ScaleFEx | CellProfiler | Embeddings |
---|---|---|---|
Computation time for 1 plate (AWS) | 35 min | 1 h 50 min | NA |
Computation cost for 1 plate (AWS) | ∼$3 | ∼$40 | NA |
AWS machine infrastructure for 1 plate | 6 × C5.12xlarge (288 vCPUs) | 200 × C5.xlarge (800 vCPUs) | NA |
Output size | 16 GB CSV / 8 GB Parquet | ∼100 GB CSV | 2.6 GB CSV / 1.6 GB Parquet |
Aggregation step | None | 5 h | None |
File download cost | $0.8 | ∼$10 | NA |
Total number of features | 1861 | 3578 | 320 |
Uncorrelated features | 325 | 413 | 320 |
Number of zero-variance features | 0 | 36 | 0 |
Total used features | 325 | 377 | 320 |
Average AUC for binary drug classification | 0.91 | 0.88 | 0.92 |
Comparison of ScaleFEx, CellProfiler, and an embeddings-based approach for a single plate: computation time and cost on AWS, output size, feature counts (total, uncorrelated, zero-variance, and used), and performance in binary drug classification, measured as the average area under the curve (AUC).
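
The arithmetic behind some rows of Table 1 can be checked with a short sketch. The values below are transcribed from the table; the two formulas are assumptions made here for illustration and are not stated in the source: compute effort is taken as vCPUs multiplied by wall-clock hours, and the number of used features is taken as the uncorrelated features minus the zero-variance ones (which matches the reported counts). The function names are hypothetical.

```python
# Values transcribed from Table 1. The formulas are illustrative assumptions,
# not a description of how the authors derived their numbers.
compute = {
    # tool: (wall-clock minutes for 1 plate, total vCPUs)
    "ScaleFEx": (35, 288),
    "CellProfiler": (110, 800),  # 1 h 50 min; excludes the 5 h aggregation step
}

features = {
    # tool: (uncorrelated, zero-variance, reported "total used")
    "ScaleFEx": (325, 0, 325),
    "CellProfiler": (413, 36, 377),
    "Embeddings": (320, 0, 320),
}


def vcpu_hours(minutes: float, vcpus: int) -> float:
    """Assumed effort metric: vCPUs multiplied by wall-clock hours."""
    return vcpus * minutes / 60


def used_features(uncorrelated: int, zero_variance: int) -> int:
    """Assumed relation: drop zero-variance features from the uncorrelated set."""
    return uncorrelated - zero_variance


for tool, (minutes, vcpus) in compute.items():
    print(f"{tool}: {vcpu_hours(minutes, vcpus):.1f} vCPU-hours")

for tool, (uncorrelated, zero_var, reported) in features.items():
    derived = used_features(uncorrelated, zero_var)
    print(f"{tool}: derived {derived}, reported {reported}")
```

Under these assumptions, the extraction step alone differs by nearly an order of magnitude in vCPU-hours between the two tools, which is in line with the cost gap reported in the table.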