Algorithm 1: Few-Shot Model Compression.

Input: the base class dataset, the validation dataset, and the novel class dataset; the teacher network and the student network; the temperature parameter τ; and the loss-weighting hyperparameters.
Output: the predicted labels of the query samples from the novel class dataset.
Stage 1: Teacher network pre-training
While epoch ≤ maximum number of iterations:
    Randomly select a batch of images from the base class dataset.
    Feed the images into the backbone of the teacher network to extract features.
    Obtain the base class and rotation class probability values.
    Pre-train the teacher network according to Equation (4).
Stage 2: Few-shot model compression
While epoch ≤ maximum number of iterations:
    Randomly select a batch of images from the base class dataset.
    Feed the images separately into the backbones of the teacher network and the student network to extract features.
    Obtain the base class probability values from the teacher network and the student network, respectively.
    Calibrate the feature error distribution between the student network and the teacher network according to Equation (16).
    Calculate the knowledge distillation loss for the intermediate features according to Equation (17).
    Calculate the KL divergence-based loss between the predicted outputs of the student network and the teacher network according to Equation (20).
    Calculate the cross-entropy loss of the student network according to Equation (21).
    Train the student network according to Equation (22).
Stage 3: Few-shot model testing
While epoch ≤ maximum number of iterations:
    Process images from the novel class dataset through the feature extractor to obtain features.
    Train the classifier for the novel classes on the support set.
    Test on the query set of the novel class dataset.
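
To make the three stages more concrete, the sketches below illustrate one possible realization of each stage. Stage 1 pre-trains the teacher with an auxiliary rotation-prediction head; the minimal PyTorch-style sketch below assumes a shared backbone with two linear heads and an equal-style weighting `lam` standing in for the trade-off in the paper's Equation (4), so the function names and the weighting are illustrative, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def rotate_batch(images):
    """Create 4 rotated copies (0/90/180/270 degrees) of each image
    and the corresponding rotation-class labels."""
    rotated = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    rot_labels = torch.arange(4).repeat_interleave(images.size(0)).to(images.device)
    return torch.cat(rotated, dim=0), rot_labels

def pretrain_step(backbone, base_head, rot_head, images, labels, optimizer, lam=1.0):
    """One Stage-1 update: classify base classes and rotation classes.
    `lam` is a hypothetical weight standing in for the Eq. (4) trade-off."""
    images_rot, rot_labels = rotate_batch(images)
    labels_rep = labels.repeat(4)          # base label is unchanged by rotation
    feats = backbone(images_rot)           # shared teacher feature extractor
    loss = F.cross_entropy(base_head(feats), labels_rep) \
         + lam * F.cross_entropy(rot_head(feats), rot_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```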
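For Stage 2, the following hedged sketch shows how the calibrated feature-distillation loss, the temperature-scaled KL loss, and the cross-entropy loss could be combined into a single student update. The `return_features=True` interface, the adapter-plus-normalization form of the calibration, and the weights `alpha`, `beta`, `tau` are assumptions for illustration; Equations (16), (17), and (20)-(22) in the paper give the exact definitions.

```python
import torch
import torch.nn.functional as F

def calibrated_feature_loss(f_s, f_t, adapter):
    """Sketch of Eqs. (16)-(17): map the student feature toward the teacher's
    feature space with a small adapter, normalize both, and penalize the gap.
    The adapter module and the MSE penalty are illustrative assumptions."""
    f_s = F.normalize(adapter(f_s).flatten(1), dim=1)
    f_t = F.normalize(f_t.flatten(1), dim=1)
    return F.mse_loss(f_s, f_t)

def distillation_step(student, teacher, adapter, images, labels,
                      optimizer, tau=4.0, alpha=0.5, beta=1.0):
    """One Stage-2 update combining feature KD, logit KD, and cross-entropy.
    alpha/beta/tau are hypothetical weights standing in for Eq. (22)."""
    with torch.no_grad():
        f_t, logits_t = teacher(images, return_features=True)   # assumed interface
    f_s, logits_s = student(images, return_features=True)

    loss_feat = calibrated_feature_loss(f_s, f_t, adapter)       # Eq. (17)
    loss_kl = F.kl_div(F.log_softmax(logits_s / tau, dim=1),     # Eq. (20)
                       F.softmax(logits_t / tau, dim=1),
                       reduction="batchmean") * tau * tau
    loss_ce = F.cross_entropy(logits_s, labels)                  # Eq. (21)
    loss = loss_ce + alpha * loss_kl + beta * loss_feat          # Eq. (22)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```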
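Stage 3 can be illustrated with a standard episode-level evaluation loop: freeze the compressed student backbone, fit a simple classifier on the support features of each novel-class task, and score the query set. The choice of scikit-learn logistic regression below is an assumption; the listing only states that a classifier is trained for the novel classes.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

@torch.no_grad()
def evaluate_episode(backbone, support_x, support_y, query_x, query_y):
    """Fit a linear classifier on support features and report query accuracy.
    Logistic regression is an illustrative choice of novel-class classifier."""
    backbone.eval()
    z_support = backbone(support_x).flatten(1).cpu().numpy()
    z_query = backbone(query_x).flatten(1).cpu().numpy()
    clf = LogisticRegression(max_iter=1000).fit(z_support, support_y.cpu().numpy())
    preds = clf.predict(z_query)
    return float(np.mean(preds == query_y.cpu().numpy()))
```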