1. Alex used ‘SE-ResNeXt50’, whereas I used the standard ‘ResNeXt50’. This may be why my public leaderboard score of ‘0.9716’ is not as good as Alex’s. After the competition, I spent some time reading the paper on ‘SE-ResNeXt50’. It is a simple and interesting idea for improving the architecture of a neural network. Maybe I can use this model in my next Kaggle competition.
2. In this competition, I split the training dataset into ten folds and trained three different models on different train/eval splits. Ensembling these three models gave a nice score. Bagging seems to be a good method in practice.
3. After training a model to a ‘so far so good’ F1 score using SGD with ReduceLROnPlateau in Keras, I used it as the ‘base model’ for the subsequent fine-tuning. By ensembling all the high-scoring fine-tuned models, I eventually got my best score. This strategy comes from Snapshot Ensembles.
4. By the way, ReduceLROnPlateau is really useful when using SGD as the optimizer.
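
The fold-and-ensemble idea from point 2 can be sketched roughly as below, in plain Python. This is only an illustration under assumptions: the helper names (`make_folds`, `train_eval_split`, `average_predictions`), the choice of folds 0, 1 and 2 as the eval folds, the toy probabilities, and the use of simple probability averaging are all hypothetical and not necessarily what was done in the competition.

```python
import random


def make_folds(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def train_eval_split(folds, eval_fold):
    """Hold out one fold for evaluation and train on all the others."""
    train = [i for k, fold in enumerate(folds) if k != eval_fold for i in fold]
    return train, list(folds[eval_fold])


def average_predictions(per_model_probs):
    """Average class probabilities across models, sample by sample."""
    n_models = len(per_model_probs)
    return [
        [sum(class_probs) / n_models for class_probs in zip(*rows)]
        for rows in zip(*per_model_probs)
    ]


# Ten folds; each of the three models holds out a different fold (assumed choice).
folds = make_folds(n_samples=100, n_folds=10)
splits = [train_eval_split(folds, eval_fold=k) for k in (0, 1, 2)]

# Toy stand-ins for the test-set probabilities predicted by the three models.
probs_a = [[0.9, 0.1], [0.4, 0.6]]
probs_b = [[0.7, 0.3], [0.2, 0.8]]
probs_c = [[0.8, 0.2], [0.3, 0.7]]
ensemble = average_predictions([probs_a, probs_b, probs_c])
```

Each model sees a slightly different 90% of the training data, so their errors are partly decorrelated, which is probably why averaging them helps. Weighted averaging or majority voting would be reasonable alternatives to the plain mean used here.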