In the previous article, I reached mAP 0.770 for VOC2007 test.
Four months has past. After trying a lot of interesting ideas from different papers, such as FPN, celu, RFBNet, I finally realised that the data is more important than network structures. Then I use COCO2017+VOC instead of only VOC to train my model. The mAP for VOC2007 test eventually reached 0.797.
But another strange thing happen: there are will be a strange big bounding box around the whole image for the 16-birds-image. After using dropout and changing augmentation policies, the strange big box still existed.
I doubt that COCO2017 dataset for birds is not general enough. Therefore I decided to use a more abundant dataset — Open Images Dataset V5. After retrieve all bird images from Open Images Dataset V5, I get 18525 images with corresponding annotations. By using them for training, I finally got a more promising bird detection result for that 16-birds-image (by using threshold 0.65):
Seems these bird images in Open Images Dataset V5 are more general than COCO2017. But the mAP of COCO evaluation is smaller for the model trained by Open Images than model trained by COCO2017. So it looks like I need a more comprehensive evaluation metrics now.