Algorithm 1 Joint Training Procedure for DeepEnhancerPPO
1: for each epoch do
2:     for each batch in the training data do
3:         Perform forward pass to extract features
4:         current_state ← model.get_feature_vector()
5:         action ← ppo.predict(current_state)
6:         model.apply_feature_mask(action)
7:         predictions ← model(batch)
8:         Optimize feature extraction and classification modules
9:     end for
10:     Train PPO agent using experiences from the current epoch
11: end for
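The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `ToyModel`, `ToyAgent`, `extract`, `optimize`, and `joint_train` are hypothetical stand-ins, the action is assumed to be a binary mask over features, and batch accuracy is assumed as the reward stored in each experience. Only `get_feature_vector`, `predict`, and `apply_feature_mask` mirror names from the algorithm.

```python
import random


class ToyModel:
    """Hypothetical stand-in for the feature extraction and classification modules."""

    def __init__(self, n_features):
        self.n_features = n_features
        self.mask = [1] * n_features
        self.features = [0.0] * n_features
        self.updates = 0

    def extract(self, inputs):
        # Step 3: "forward pass", here just the per-feature mean over the batch.
        n = len(inputs)
        self.features = [sum(row[j] for row in inputs) / n
                         for j in range(self.n_features)]

    def get_feature_vector(self):
        return list(self.features)

    def apply_feature_mask(self, action):
        self.mask = list(action)

    def __call__(self, inputs):
        # Toy classifier: predict 1 when the masked feature sum is positive.
        return [int(sum(x * m for x, m in zip(row, self.mask)) > 0)
                for row in inputs]

    def optimize(self, predictions, labels):
        # Placeholder for a gradient step; returns batch accuracy (assumed reward).
        self.updates += 1
        return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


class ToyAgent:
    """Hypothetical stand-in for the PPO agent; it samples a random binary mask."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.train_calls = 0

    def predict(self, state):
        mask = [self.rng.randint(0, 1) for _ in state]
        if not any(mask):
            mask[0] = 1  # keep at least one feature active
        return mask

    def train(self, experiences):
        # A real agent would run its policy update here; this only counts calls.
        self.train_calls += 1


def joint_train(model, agent, batches, n_epochs):
    """Run the two nested loops of Algorithm 1; return experiences per epoch."""
    experiences_per_epoch = []
    for _ in range(n_epochs):                          # step 1
        experiences = []
        for inputs, labels in batches:                 # step 2
            model.extract(inputs)                      # step 3
            current_state = model.get_feature_vector() # step 4
            action = agent.predict(current_state)      # step 5
            model.apply_feature_mask(action)           # step 6
            predictions = model(inputs)                # step 7
            reward = model.optimize(predictions, labels)  # step 8
            experiences.append((current_state, action, reward))
        agent.train(experiences)                       # step 10
        experiences_per_epoch.append(len(experiences))
    return experiences_per_epoch
```

The point of the sketch is the update schedule: the model is optimized once per batch, while the agent is trained once per epoch on the experiences gathered during that epoch.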