Summary of the 15 methods we compare in the simulation studies: Ideal, Unadjusted, and 13 variants of PCA, SVA, PEER, and HCP (Additional file 1: Section S4). From these 15 methods, we select a few representative ones (Section 5.2) for detailed comparison in Simulation Design 2; their abbreviations are shown in (D). Y denotes the gene expression matrix, denotes the residual matrix output by PEER, denotes the known covariate matrix, and denotes the hidden covariate matrix. In Line 3, PCA is run on Y directly; in Line 4, PCA is run after the effects of are regressed out from Y (Additional file 1: Section S4). The addition signs in (C) denote column concatenation. “Filtered” means that we filter out the known covariates that are already captured well by the inferred covariates (unadjusted ); this filtering is needed only when the hidden variable inference method in (A) does not explicitly take the known covariates into account.