Algorithm 1.

Alternating Imputation and Correction Method (AICM)

Hyperparameters: Dropping rate r, maximum number of iterations iter, regularization term λ_r, and hard constraint term λ_h.

Input: Two data matrices A, B ∈ ℝ^{n × p} of summarized sensitivity data, each covering the same n drugs and p cell lines. We denote the jth columns of the two matrices as a^j and b^j respectively, j ∈ {1, 2,…, p}, and the entries at the ith row and jth column as A_ij and B_ij respectively, (i, j) ∈ {1, 2,…, n} × {1, 2,…, p}.

Initialization: For each j ∈ {1, 2,…, p}, let B_{ij}^{NA} denote the entries of column j for which B_ij is missing while A_ij is not. Using the rows where both A_ij and B_ij are observed, we fit a linear model such that α_j, β_j minimize

‖ b^{j} - (α_{j} a^{j} + β_{j}) ‖_{2}

and then impute the missing values as

B_{ij}^{NA} = α_{j} A_{ij} + β_{j}.

Then swap the roles of A and B and repeat the above process. The two matrices now have the same missing indices.
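The initialization step can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: it assumes missing values are encoded as NaN, that both fits use the original (not yet imputed) matrices, and that an ordinary least-squares line from `np.polyfit` is an acceptable fit. The function names `impute_from_other` and `initialize` are hypothetical.

```python
import numpy as np

def impute_from_other(A, B):
    """Fill entries missing in B but observed in A, using a per-column linear fit."""
    B_filled = B.copy()
    for j in range(A.shape[1]):
        a, b = A[:, j], B[:, j]
        both = ~np.isnan(a) & ~np.isnan(b)    # rows usable for fitting
        target = ~np.isnan(a) & np.isnan(b)   # the set B_ij^NA for this column
        if both.sum() < 2 or not target.any():
            continue  # nothing to impute, or too few points to fit a line
        # Least-squares line b ≈ alpha * a + beta (assumed fitting procedure).
        alpha, beta = np.polyfit(a[both], b[both], deg=1)
        B_filled[target, j] = alpha * a[target] + beta
    return B_filled

def initialize(A, B):
    """Impute B from A, then A from B; both fits use the original matrices."""
    B_new = impute_from_other(A, B)
    A_new = impute_from_other(B, A)
    return A_new, B_new
```

After `initialize`, an entry remains NaN only where it was missing in both inputs, which is why the two outputs share the same missing indices.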

for k in {1, 2,…, iter} do

Swap: A → B, B → A.

Drop: Drop r × n × p entries from A uniformly at random. We denote the indices of the dropped entries as D ⊆ {1, 2,…, n} × {1, 2,…, p}, and the dropped data as the set

A^{DR} := {A_{ij} : (i, j) ∈ D}.

In a similar fashion, we denote the dropped data of column j as

a_{DR}^{j} := {A_{ij} : i such that (i, j) ∈ D},

and the corresponding data in the jth column of B as b_{ADR}^{j}. For each j, we fit a pair of parameters α_j ∈ ℝ and β_j ∈ ℝ with the following objective function:

min_{α_{j}, β_{j}} \frac{1}{n} ‖ b^{j} - (α_{j} a^{j} + β_{j}) ‖_{2} + λ_{r} ‖ a_{DR}^{j} - (α_{j} b_{ADR}^{j} + β_{j}) ‖_{\infty}    (4)
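To make the drop step and objective (4) concrete, here is a hedged NumPy sketch. It assumes the matrices contain no missing values at this point, that the number of dropped entries is r × n × p rounded to the nearest integer, and that the first term is evaluated over all n rows exactly as written. The names `sample_drop_mask` and `objective` are hypothetical. The sketch only evaluates the objective; the choice of solver is not specified in the algorithm, and because the objective is not differentiable everywhere, a generic derivative-free or convex solver would be one option.

```python
import numpy as np

def sample_drop_mask(A, r, rng):
    """Boolean mask D marking round(r * n * p) entries chosen uniformly at random."""
    n, p = A.shape
    n_drop = int(round(r * n * p))
    flat = rng.choice(n * p, size=n_drop, replace=False)
    D = np.zeros(n * p, dtype=bool)
    D[flat] = True
    return D.reshape(n, p)

def objective(alpha, beta, A, B, D, j, lam_r):
    """Value of objective (4) for column j at a given (alpha, beta)."""
    n = A.shape[0]
    a, b = A[:, j], B[:, j]
    # First term: (1/n) * || b^j - (alpha * a^j + beta) ||_2
    fit_term = np.linalg.norm(b - (alpha * a + beta), ord=2) / n
    # Second term: lam_r * || a_DR^j - (alpha * b_ADR^j + beta) ||_inf
    rows = D[:, j]
    drop_term = np.max(np.abs(a[rows] - (alpha * b[rows] + beta))) if rows.any() else 0.0
    return fit_term + lam_r * drop_term
```

A mask can be drawn with `sample_drop_mask(A, r, np.random.default_rng(seed))`, after which `objective` can be handed to whichever minimizer is preferred.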

Correction: Set a_{DR}^{j} = α_{j} b_{ADR}^{j} + β_{j} for each j. We denote the set of corrected values as

A^{IMP} = \cup_{j = 1}^{p} a_{DR}^{j}
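The correction step amounts to overwriting each dropped entry of A with the value predicted from the matching entry of B. A short sketch, assuming the dropped indices are stored as a boolean mask `D` and the fitted parameters as per-column arrays `alphas` and `betas` (all hypothetical names):

```python
import numpy as np

def correct_dropped(A, B, D, alphas, betas):
    """Return a copy of A whose dropped entries are replaced by alpha_j * B_ij + beta_j."""
    A_imp = A.copy()
    for j in range(A.shape[1]):
        rows = D[:, j]  # rows of column j that were dropped
        A_imp[rows, j] = alphas[j] * B[rows, j] + betas[j]
    return A_imp
```

Returning a copy keeps the original A available, which the threshold step needs as a reference.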

Threshold: For (i, j) ∈ D, we restrict the corrected value A^{IMP}_{ij} to within a relative margin λ_h of the original value A_ij:

A^{IMP}_{ij} = max(min(A^{IMP}_{ij}, (1 + λ_{h}) A_{ij}), (1 - λ_{h}) A_{ij})    (5)
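A minimal NumPy sketch of the threshold step, under the assumption that it is meant to clamp each corrected value to the interval between (1 − λ_h) A_ij and (1 + λ_h) A_ij. The function name `threshold` and the boolean mask `D` are hypothetical. The two bounds are ordered explicitly so that the clamp also behaves sensibly when A_ij is negative, a case the algorithm does not discuss.

```python
import numpy as np

def threshold(A_imp, A, D, lam_h):
    """Clamp corrected entries (where D is True) to within lam_h * |A_ij| of A_ij."""
    bound_1 = (1 - lam_h) * A
    bound_2 = (1 + lam_h) * A
    lower = np.minimum(bound_1, bound_2)  # handles negative A_ij, where the bounds swap
    upper = np.maximum(bound_1, bound_2)
    clipped = np.clip(A_imp, lower, upper)
    # Entries outside D are left untouched.
    return np.where(D, clipped, A_imp)
```

With λ_h = 0 the corrected values collapse back onto the original data, while a large λ_h leaves the regression output effectively unconstrained.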

end for