Algorithm 1 Feature self-augmentation process
1: # I: input image
2: # T: input text
3: # F ← Image_Encoder()
4: # T ← Feature_Extractor()
5: # A ← Adaptor()
6: # N ← Feature_Filter()
7: pretrain_init(F)
8: for each x in data_loader do
9:     # Asymmetry constraint
10:     # Extract feature representations of the different modalities
11:     I_f = image_encoder(I)
12:     T_f = text_encoder(T)
13:     # Loss function
14:     loss_cl = cross_entropy_loss(I_f, T_f)
15:     loss_IRC = cross_entropy_loss(I_f, I_f) - cross_entropy_loss(T_f, T_f)
16:     loss = loss_cl + loss_IRC
17:     F ← F.detach()
18:     update(T, D)
19: end for
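The loss computation in steps 14 to 16 of Algorithm 1 can be illustrated with a minimal sketch in plain Python. The algorithm does not define `cross_entropy_loss`, so the sketch assumes a contrastive reading: the cross-entropy over the pairwise cosine-similarity matrix of two feature batches, with row i matched to column i. The helper names (`l2_normalize`, `contrastive_cross_entropy`, `total_loss`) and the temperature value 0.07 are hypothetical and introduced only for illustration. The encoder calls, the `detach` step, and the parameter update are not covered.

```python
import math


def l2_normalize(v):
    """Scale a feature vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def contrastive_cross_entropy(a_feats, b_feats, temperature=0.07):
    """Assumed reading of cross_entropy_loss(A, B) in Algorithm 1.

    Builds the cosine-similarity matrix between the two batches and returns
    the mean cross-entropy with index i in A matched to index i in B.
    The temperature is a hypothetical hyperparameter, not from the source.
    """
    a = [l2_normalize(v) for v in a_feats]
    b = [l2_normalize(v) for v in b_feats]
    total = 0.0
    for i, a_i in enumerate(a):
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(a)


def total_loss(image_feats, text_feats):
    """Steps 14 to 16: cross-modal loss plus the intra-modal difference term."""
    loss_cl = contrastive_cross_entropy(image_feats, text_feats)
    loss_irc = (contrastive_cross_entropy(image_feats, image_feats)
                - contrastive_cross_entropy(text_feats, text_feats))
    return loss_cl, loss_irc, loss_cl + loss_irc


if __name__ == "__main__":
    # Toy batch of two image/text feature pairs (values are illustrative only).
    I_f = [[1.0, 0.0], [0.0, 1.0]]
    T_f = [[0.9, 0.1], [0.2, 0.8]]
    loss_cl, loss_irc, loss = total_loss(I_f, T_f)
    print(loss_cl, loss_irc, loss)
```

Under this reading, `loss_IRC` vanishes when the image and text batches have identical similarity structure, and `loss_cl` grows when matched pairs are less similar than mismatched ones.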