2025 Sep 15;17(18):3005. doi: 10.3390/cancers17183005
Algorithm 1: ViT-DCNN (Vision Transformer with Deformable Convolution) for Lung and Colon Cancer Classification
1: Input: D = {(Xi, Yi)}, α, T, B, θViT, θDConv, N
2: Initialize: θViT, θDConv
3: for epoch = 1 to T do
4:   for batch = 1 to N/B do
5:         Extract mini-batch:
6:                  $(X_{batch}, Y_{batch}) \leftarrow \{(X_i, Y_i)\}_{i \in batch}$
7:                      Apply Data Augmentation:
8:                       $(X_{aug}, Y_{aug}) \leftarrow \text{Augment}(X_{batch}, Y_{batch})$
9:            Vision Transformer (ViT) Forward Pass:
10:            Patch Embedding:
11:                     $P_i = \text{Flatten}(X_i) \cdot W_{emb} + b_{emb}$
12:            Positional Encoding:
13:                    $Z_i = P_i + PE_i$
14:            Multi-Head Self-Attention:
15:                    $\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$
16:         Feed-Forward Network:
17:                 $\text{FFN}(Z) = \max(0, ZW_1 + b_1)W_2 + b_2$
18:          Deformable Convolution Forward Pass:
19:           Deformable Convolution:
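20:                     $y_{ij} = \sum_{m=1}^{M}\sum_{n=1}^{N} x_{i+m+\Delta m_{ij},\; j+n+\Delta n_{ij}} \cdot w_{mn}$
21:              Offset Learning:
22:                      $(\Delta m_{ij}, \Delta n_{ij}) = \text{Conv}(F_{ij}), \quad F_{ij} \in \mathbb{R}^{C}$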
23:              Spatial Attention:
24:                      $F_{refined} = F_{ij} \cdot A_{ij}$
25:          Hierarchical Feature Fusion (HFF):
26:              Concatenate Vision Transformer and Deformable CNN Features:
27:                     $F_{concat} = \text{concat}(F_{ViT}, F_{DConv})$
28:              Squeeze-and-Excitation (SE) Block:
29:                      s=σ(MLP(z))
30:              Refined Feature Map:
31:                      $F_{se} = F_{concat} \cdot s$
32:          Prediction and Softmax Activation:
33:              Global Average Pooling:
34:                      $z = \text{GAP}(F_{concat}) = \frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W} F_{concat}(i, j)$
35:              Softmax Layer:
36:                      $P_c = \text{Softmax}(W \cdot z + b)$
37:              Predicted Class:
38:                      $\hat{Y} \leftarrow \arg\max(P_{softmax})$
39:          Compute Loss:
40:              Cross-Entropy Loss:
41:                      $L_{cross} = -\sum_{i=1}^{N}\sum_{j=1}^{K} y_{ij} \log \hat{y}_{ij}$
42:              Gradient Computation:
43:                      $\nabla_{\theta_{ViT}} L_{cross}, \quad \nabla_{\theta_{DConv}} L_{cross}$
44:          Parameter Update (Using AdamW optimizer):
45:              Update the Vision Transformer parameters:
46:                      $\theta_{ViT} \leftarrow \theta_{ViT} - \alpha \cdot \nabla_{\theta_{ViT}} L_{cross}$
47:              Update the Deformable CNN parameters:
48:                     $\theta_{DConv} \leftarrow \theta_{DConv} - \alpha \cdot \nabla_{\theta_{DConv}} L_{cross}$
49:         end for
50:  end for
51:  Output:
52:  Trained ViT-Deformable CNN model with updated parameters θViT and θDConv
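
The patch embedding and positional encoding steps (10 to 13) can be illustrated with a small dependency-free sketch. It is not the authors' implementation: the helper names `patchify`, `embed`, and `sinusoidal_pe`, the toy 4 x 4 image, and the projection matrix are hypothetical, and the fixed sine/cosine encoding is an assumption, since the algorithm does not say how PE_i is obtained.

```python
import math

def patchify(image, p):
    """Split an H x W image (list of rows) into flattened p x p patches, row-major."""
    h, w = len(image), len(image[0])
    return [[image[r + dr][c + dc] for dr in range(p) for dc in range(p)]
            for r in range(0, h - p + 1, p) for c in range(0, w - p + 1, p)]

def embed(patches, w_emb, b_emb):
    """P_i = Flatten(X_i) . W_emb + b_emb, with W_emb of shape (p*p, d)."""
    return [[sum(x * w_row[k] for x, w_row in zip(patch, w_emb)) + b_emb[k]
             for k in range(len(b_emb))] for patch in patches]

def sinusoidal_pe(num_tokens, d):
    """Fixed sine/cosine positional encoding (an assumed choice for PE_i)."""
    return [[math.sin(pos / 10000 ** (k / d)) if k % 2 == 0
             else math.cos(pos / 10000 ** ((k - 1) / d))
             for k in range(d)] for pos in range(num_tokens)]

# Toy 4 x 4 single-channel image (illustrative values only).
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
patches = patchify(image, 2)                               # 4 patches of length 4
w_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]   # hypothetical 4 x 2 projection
p_tokens = embed(patches, w_emb, [0.0, 0.0])               # step 11
pe = sinusoidal_pe(len(p_tokens), 2)
z_tokens = [[a + b for a, b in zip(tok, enc)]              # step 13: Z_i = P_i + PE_i
            for tok, enc in zip(p_tokens, pe)]
```

A practical model would use a tensor library and a much larger embedding dimension; plain lists are used here only to keep each index visible.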
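
Steps 15 and 17 (scaled dot-product attention and the feed-forward network) can be sketched for a single head as follows. This is a minimal illustration in plain Python, assuming Q = K = V are the token embeddings themselves, with no learned projection matrices; the weights passed to `ffn` are made-up values.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V for a single head; also returns the weights."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

def ffn(z, w1, b1, w2, b2):
    """max(0, Z W1 + b1) W2 + b2, applied to every token (row) of Z."""
    hidden = [[max(0.0, h + b) for h, b in zip(row, b1)] for row in matmul(z, w1)]
    return [[o + b for o, b in zip(row, b2)] for row in matmul(hidden, w2)]

# Toy input: 3 tokens with 2 features each (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, weights = attention(tokens, tokens, tokens)
out = ffn(attended, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], [[1.0], [1.0]], [0.1])
```

Each row of `weights` is a probability distribution over the tokens, which is what lets every patch draw on information from all the others.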
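
The deformable convolution in step 20 samples the input at positions shifted by learned offsets. The sketch below evaluates one output position, assuming bilinear interpolation for fractional positions and zero padding outside the input; neither detail is stated in the algorithm. Following step 20 as written, a single offset pair is shared by all kernel taps at position (i, j), and it is passed in directly rather than predicted by a convolution as in step 22. Indices are 0-based, and the function names are hypothetical.

```python
import math

def bilinear(x, r, c):
    """Sample 2-D grid x at fractional position (r, c); out-of-range reads count as 0."""
    r0, c0 = math.floor(r), math.floor(c)
    value = 0.0
    for rr in (r0, r0 + 1):
        for cc in (c0, c0 + 1):
            if 0 <= rr < len(x) and 0 <= cc < len(x[0]):
                weight = (1 - abs(r - rr)) * (1 - abs(c - cc))
                value += weight * x[rr][cc]
    return value

def deform_conv_at(x, w, offset, i, j):
    """y_ij = sum_m sum_n x[i+m+dm, j+n+dn] * w[m][n] for one output position.

    offset is the (dm, dn) pair for position (i, j); assumed given here.
    """
    dm, dn = offset
    total = 0.0
    for m, w_row in enumerate(w):
        for n, w_mn in enumerate(w_row):
            total += bilinear(x, i + m + dm, j + n + dn) * w_mn
    return total

x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
regular = deform_conv_at(x, w, (0.0, 0.0), 0, 0)   # zero offset: ordinary convolution
shifted = deform_conv_at(x, w, (0.5, 0.0), 0, 0)   # sampling grid moved half a row down
```

With a zero offset the function reduces to a standard convolution tap sum, which is a useful sanity check before learned offsets are plugged in.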
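
Steps 27 to 38 (feature concatenation, squeeze-and-excitation gating, global average pooling, and softmax) can be sketched end to end on toy data. Two assumptions are made and should be read as such: the SE gate uses a one-hidden-layer ReLU MLP, and the classifier pools the SE-refined map F_se, whereas step 34 as printed pools F_concat. All weights and feature values are illustrative.

```python
import math

def gap(fmap):
    """Global average pooling: one mean per channel of a C x H x W nested list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def dense(vec, weights, bias):
    """Affine map W.vec + b with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def se_gate(z, w1, b1, w2, b2):
    """s = sigmoid(MLP(z)) with a one-hidden-layer ReLU MLP (assumed shape)."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-v)) for v in dense(hidden, w2, b2)]

def softmax(logits):
    """Numerically stable softmax over class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy branch outputs: one channel each, 2 x 2 spatial size (illustrative values).
f_vit = [[[1.0, 3.0], [5.0, 7.0]]]
f_dconv = [[[2.0, 2.0], [2.0, 2.0]]]
f_concat = f_vit + f_dconv                       # step 27: channel-wise concatenation
z = gap(f_concat)                                # squeeze vector, one value per channel
s = se_gate(z, [[0.5, 0.5]], [0.0], [[1.0], [-1.0]], [0.0, 0.0])   # step 29
f_se = [[[v * g for v in row] for row in ch]     # step 31: rescale each channel by its gate
        for ch, g in zip(f_concat, s)]
# Assumption: pool the refined map before the classifier (step 34 prints F_concat).
probs = softmax(dense(gap(f_se), [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
predicted = max(range(len(probs)), key=probs.__getitem__)   # step 38: argmax
```

The gate values in `s` lie strictly between 0 and 1, so the SE block can only attenuate channels relative to one another, never amplify them beyond their original scale.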
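
The loss and update steps (41 to 48) can be sketched as below. Step 44 names AdamW, while steps 46 and 48 print a plain gradient step, so both are shown for comparison. The AdamW hyperparameters (`beta1`, `beta2`, `eps`, `weight_decay`) are common defaults assumed for illustration and do not come from the source; gradients are passed in as given numbers rather than computed by backpropagation.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """L = -sum_i sum_j y_ij * log(yhat_ij), summed over samples and classes."""
    return -sum(t * math.log(max(p, eps))
                for row_t, row_p in zip(y_true, y_pred)
                for t, p in zip(row_t, row_p))

def gradient_step(theta, grad, alpha):
    """theta <- theta - alpha * grad, the update as printed in steps 46 and 48."""
    return [p - alpha * g for p, g in zip(theta, grad)]

def adamw_step(theta, grad, state, alpha,
               beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    """One AdamW step with decoupled weight decay; state carries m, v and t."""
    state["t"] += 1
    t = state["t"]
    new_theta = []
    for k, (p, g) in enumerate(zip(theta, grad)):
        state["m"][k] = beta1 * state["m"][k] + (1 - beta1) * g        # first moment
        state["v"][k] = beta2 * state["v"][k] + (1 - beta2) * g * g    # second moment
        m_hat = state["m"][k] / (1 - beta1 ** t)                       # bias correction
        v_hat = state["v"][k] / (1 - beta2 ** t)
        new_theta.append(p - alpha * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * p))
    return new_theta

# One-hot labels and predicted probabilities for two samples (illustrative values).
loss = cross_entropy([[1, 0], [0, 1]], [[0.5, 0.5], [0.25, 0.75]])
theta_plain = gradient_step([1.0, -2.0], [0.5, -0.5], alpha=0.1)
state = {"m": [0.0, 0.0], "v": [0.0, 0.0], "t": 0}
theta_adamw = adamw_step([1.0, -2.0], [0.5, -0.5], state, alpha=0.1)
```

In the algorithm the same update is applied separately to θViT and θDConv; here a single parameter list stands in for either set.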