|
Input reference gene expression matrix, prior to scaling. |
|
Scaled reference gene expression matrix. |
|
One-hot design matrix assigning reference cells (columns) to batches (rows). |
|
Zero matrix assigning reference cells (columns) to query batches (rows). All values are 0 because reference cells do not belong to query batches. This term is used in the derivation of the reference compression terms. |
|
Reference gene means used to center each gene for PCA. |
|
Reference gene standard deviations used to scale each gene for PCA. |
|
Gene loadings from the original PCA (before Harmony integration). |
|
Original (pre-harmonized) PC embedding for reference cells. |
|
Integrated embedding for reference cells in harmonized PC (hPC) space, as output by Harmony. |
|
Soft cluster assignment of reference cells (columns) to clusters (rows), output by Harmony. Each column is a probability distribution that sums to 1. |
|
Cluster centroid locations in the harmonized embedding, L2-normalized. |
|
3D tensor of the estimated parameters (betas and intercepts) of the linear mixture model for each of the clusters, fit on the reference cells. |
|
First reference compression term. Vector containing the size of each cluster, effectively the number of reference cells it contains. |
|
Second reference compression term. |
|
Set of Symphony minimal reference elements. |
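
Several of the quantities described above can be illustrated with a small NumPy sketch on synthetic data. This is not the Symphony or Harmony implementation: all variable names, dimensions, and the random inputs are hypothetical, and two steps are assumptions made for illustration only, namely taking cluster sizes as row sums of the soft assignment matrix and forming centroids as a soft-assignment-weighted sum of the embedding before L2 normalization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions, chosen only for illustration.
n_genes, n_cells, n_batches, n_clusters, n_pcs = 5, 8, 2, 3, 4

# Unscaled reference expression matrix (genes x cells), synthetic counts.
expr = rng.poisson(lam=3.0, size=(n_genes, n_cells)).astype(float)

# Per-gene means and standard deviations used to center and scale each gene.
gene_mean = expr.mean(axis=1, keepdims=True)
gene_sd = expr.std(axis=1, ddof=1, keepdims=True)
gene_sd = np.where(gene_sd == 0, 1.0, gene_sd)  # guard against constant genes

# Scaled expression matrix (genes x cells).
scaled = (expr - gene_mean) / gene_sd

# One-hot design matrix: batches (rows) x reference cells (columns).
batch_labels = np.array([0, 0, 0, 1, 1, 1, 0, 1])
design = np.eye(n_batches)[batch_labels].T

# Soft cluster assignments: clusters (rows) x cells (columns).
# A softmax over random scores stands in for Harmony's output here.
scores = rng.normal(size=(n_clusters, n_cells))
weights = np.exp(scores - scores.max(axis=0, keepdims=True))
soft_assign = weights / weights.sum(axis=0, keepdims=True)

# Assumption: cluster size is the row sum of the soft assignments,
# i.e. the (fractional) number of reference cells in each cluster.
cluster_size = soft_assign.sum(axis=1)

# Assumption: centroids are a weighted sum of a synthetic embedding
# (PCs x cells), then L2-normalized per cluster (per column).
embedding = rng.normal(size=(n_pcs, n_cells))
centroids = embedding @ soft_assign.T
centroids = centroids / np.linalg.norm(centroids, axis=0, keepdims=True)
```

Because each column of the soft assignment matrix sums to 1, the cluster sizes in this sketch sum to the total number of reference cells, which is one way to read "size" when cells belong to clusters fractionally.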