Integrating Multiple Interaction Networks for Gene Function Inference

Molecules. 2018 Dec 21;24(1):30. doi: 10.3390/molecules24010030

Algorithm 1: ReprsentConcat Algorithm

Input: network_files: paths to the adjacency list files; n: number of genes in the input networks; d: number of output dimensions; onttype: type of annotations to use; early_stopping_rounds: number of consecutive layers without an improvement in test F₁ after which training stops
Output: opt_pred_results: prediction results
for i=1:length(network_files)
A=load_network(network_files(i), n)
Q=rwr(A, 0.5)
R=ln(Q+1/n)
U, Σ, V=svd(R)
X_cur=U_d Σ_d^{1/2} //first d columns of U, scaled by the square roots of the top d singular values
X=hstack(X, X_cur)
end for
Y=load_annotation(onttype) //load annotations
//split the data into train data and test data
X_train, Y_train, X_test, Y_test=train_test_split(X, Y)
layer_id=0
while 1
if layer_id==0
X_cur_train=zeros(X_train)
X_cur_test=zeros(X_test)
else
X_cur_train=X_proba_train.copy()
X_cur_test= X_proba_test.copy()
end if
X_cur_train=hstack(X_cur_train, X_train)
X_cur_test=hstack(X_cur_test, X_test)
for estimator in n_randomForests
//train each forest through k-fold cross validation
y_probas= estimator.fit_transform( X_cur_train, Y_train)
y_train_proba_li+= y_probas
y_test_probas=estimator.predict_proba(X_cur_test)
y_test_proba_li+= y_test_probas
end for
y_train_proba_li /=length(n_randomForests)
y_test_proba_li /=length(n_randomForests)
train_avg_F₁=calc_F1(Y_train, y_train_proba_li) // calculate the F₁ value
test_avg_F₁=calc_F1(Y_test, y_test_proba_li)
test_F₁_list.append(test_avg_F₁)
opt_layer_id=get_opt_layer_id(test_F₁_list)
if opt_layer_id==layer_id
opt_pred_results=[y_train_proba_li, y_test_proba_li]
end if
if layer_id - opt_layer_id >= early_stopping_rounds
return opt_pred_results
end if
layer_id+=1
end while
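
The first loop of Algorithm 1 (load each network, run the random walk with restart, take the log transform, truncate the SVD, and concatenate) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The closed-form solution used for rwr, the column normalization of the adjacency matrix, and the self-loop added to isolated nodes are all assumptions, because the listing does not define rwr. The function names embed_network and represent_concat are hypothetical, and the sketch takes in-memory adjacency matrices rather than adjacency list files.

```python
import numpy as np


def rwr(A, restart_prob=0.5):
    """Random walk with restart, solved in closed form (assumed formulation)."""
    A = np.asarray(A, dtype=float).copy()
    n = A.shape[0]
    np.fill_diagonal(A, 0.0)                      # drop existing self-loops
    col_sums = A.sum(axis=0)
    # Assumption: give isolated nodes a self-loop so every column can be normalized.
    A += np.diag((col_sums == 0).astype(float))
    P = A / A.sum(axis=0, keepdims=True)          # column-stochastic transition matrix
    # Solve (I - (1 - p) P) Q = p I for the diffusion matrix Q.
    return np.linalg.solve(np.eye(n) - (1.0 - restart_prob) * P,
                           restart_prob * np.eye(n))


def embed_network(A, d, restart_prob=0.5):
    """Return the n x d features X_cur = U_d * Sigma_d^(1/2) for one network."""
    n = A.shape[0]
    Q = rwr(A, restart_prob)
    R = np.log(Q + 1.0 / n)                       # the 1/n offset avoids log(0)
    U, s, _ = np.linalg.svd(R)                    # singular values in descending order
    return U[:, :d] * np.sqrt(s[:d])              # scale each kept column by sqrt(sigma)


def represent_concat(adjacencies, d):
    """Stack the per-network embeddings side by side (the hstack step)."""
    return np.hstack([embed_network(A, d) for A in adjacencies])
```

Under these assumptions, k networks over n genes yield an n by k·d feature matrix X, which is what the cascade of random forests in the second half of the algorithm takes as input.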