Table 1.
Comparison of the performance of some automated image identification systems that predate CNNs with some recent state-of-the-art CNN-based methods on two popular fine-grained data sets (i.e., data sets whose categories are similar to each other): Bird-200-2011 (Wah et al. 2011) and Flower-102 (Nilsback and Zisserman 2008).
Methods | Bird | Flower | References |
---|---|---|---|
Pre-CNN methods | | | |
Color+SIFT | 26.7 | 81.3 | (Khan et al., 2013) |
GMaxPooling | 33.3 | 84.6 | (Murray and Perronnin, 2014) |
CNN-based techniques | | | |
CNNaug-SVM | 61.8 | 86.8 | (Razavian et al., 2014) |
MsML | 67.9 | 89.5 | (Qian et al., 2014) |
Fusion CNN | 76.4 | 95.6 | (Zheng et al., 2016) |
Bilinear CNN | 84.1 | — | (Lin et al., 2015) |
Refined CNN | 86.4 | — | (Zhang et al., 2017) |
Note: All CNN-based methods used a pretrained VGG16 network (Simonyan and Zisserman 2014) and transfer learning. Numbers indicate the percentage of correctly identified images in the predefined test set, which was not used during training.
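
As a minimal sketch of the metric reported in the table, the percentage of correctly identified test images could be computed as follows. The function name `percent_correct` and the five example labels are hypothetical, chosen only for illustration; they are not taken from any of the cited studies.

```python
def percent_correct(predicted, actual):
    """Return the percentage of test images whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("test set is empty")
    # Count the images for which the predicted category matches the true category.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical labels for five held-out test images (illustrative only).
actual = ["cardinal", "blue jay", "cardinal", "wren", "wren"]
predicted = ["cardinal", "blue jay", "wren", "wren", "wren"]

print(f"{percent_correct(predicted, actual):.1f}")  # 4 of 5 correct
```

The labels passed to such a function would come only from the predefined test set, so that the score reflects images the model did not see during training.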