Using Single Shot Detection to detect birds (Episode one)

SSD (Single Shot MultiBox Detector) is a one-stage object detection network that makes its predictions from multi-scale feature maps. I forked the code from ssd.pytorch and made a few small modifications for my bird-detection task.

At first I tried several different rectifier functions, such as ELU and RReLU, but they only reduced the mAP (mean Average Precision). I also tried changing the augmentation hyperparameters, but that didn’t help either. Only after I enabled batch normalization with this patch did the mAP improve significantly (from 0.658 to 0.739).

The detection results look like this:

Image 1: bird detection.

Image 2: bird detection.

But actually, we don’t need every annotated object class; we only need annotated bird images. Hence I changed the code to train the model on only the bird images in VOC2007 and VOC2012. Unexpectedly, the mAP was extremely low, and the model couldn’t even detect all 16 bird heads in [Image 2] above.

Why does using only bird images hurt the results? There might be two reasons: first, too few bird images (only 1000 in VOC2007 and VOC2012); second, not enough augmentation.

To test my hypothesis, I found CUB-200, a much larger bird image dataset (about 6000 images). After training on this dataset, the results were still unsatisfactory: the model couldn’t detect all three birds in [Image 1]. I need more experiments to find the reason.

Some tips about PyTorch and Python

1. ‘()’ may mean tuple or nothing.

The result is:
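
A minimal sketch of this pitfall, assuming the tip refers to the fact that parentheses alone only group an expression and that it is the comma that builds a tuple (the variable names are only for illustration):

```python
a = ()      # empty parentheses: an empty tuple
b = (1)     # parentheses only group here: this is the int 1
c = (1,)    # the trailing comma makes a one-element tuple
d = 1, 2    # commas alone also build a tuple, no parentheses needed

for label, value in [("()", a), ("(1)", b), ("(1,)", c), ("1, 2", d)]:
    print(f"{label:>5} -> {type(value).__name__}: {value!r}")
```

This matters in practice when a function expects a shape or size argument: passing `(224)` hands it an int, while `(224,)` hands it a tuple.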

2. Unlike TensorFlow’s static graph, PyTorch runs the neural network just as the code is written. This brings a lot of convenience. First, we can print any tensor in our program, whether in prediction or training. Second, just adding ‘time.time()’ calls to the code helps us profile every step of training.

3. Following the example of NVIDIA’s apex, I wrote a prefetcher to let PyTorch load data and compute in parallel. But in my test, the ‘data_prefetcher’ actually hurt training performance. The reason may be that my model (VGG16) is not computationally heavy enough, so computing takes less time than loading data.
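
Tips 2 and 3 can be illustrated together with a small CPU-only sketch. It is not the apex ‘data_prefetcher’ (which, as far as I know, overlaps host-to-GPU copies using a CUDA stream); it is a simplified thread-based analog. `BackgroundPrefetcher`, `fake_batches`, and `run_epoch` are hypothetical names, and the sleeps stand in for real loading and computing. ‘time.time()’ is used to measure how long the loop waits for data:

```python
import queue
import threading
import time


class BackgroundPrefetcher:
    """Yield items from `source`, loading up to `depth` of them ahead in a worker thread."""

    _DONE = object()  # sentinel marking the end of the source

    def __init__(self, source, depth=2):
        self._queue = queue.Queue(maxsize=depth)
        worker = threading.Thread(target=self._fill, args=(source,), daemon=True)
        worker.start()

    def _fill(self, source):
        for item in source:
            self._queue.put(item)  # blocks while the queue is full
        self._queue.put(self._DONE)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()  # blocks until the worker has produced something
        if item is self._DONE:
            self._queue.put(self._DONE)  # keep later next() calls from hanging
            raise StopIteration
        return item


def fake_batches(n_batches, load_seconds):
    """Hypothetical loader: sleeping stands in for disk reads and decoding."""
    for i in range(n_batches):
        time.sleep(load_seconds)
        yield i


def run_epoch(batches, compute_seconds):
    """Consume batches; return (results, seconds spent waiting for data)."""
    results, waiting = [], 0.0
    iterator = iter(batches)
    while True:
        start = time.time()
        try:
            batch = next(iterator)
        except StopIteration:
            break
        waiting += time.time() - start
        time.sleep(compute_seconds)  # stands in for the forward/backward pass
        results.append(batch * 2)
    return results, waiting


plain_results, plain_wait = run_epoch(fake_batches(5, 0.02), compute_seconds=0.02)
prefetched_results, prefetched_wait = run_epoch(
    BackgroundPrefetcher(fake_batches(5, 0.02)), compute_seconds=0.02
)
print(f"waiting for data without prefetch: {plain_wait:.3f}s")
print(f"waiting for data with prefetch:    {prefetched_wait:.3f}s")
```

Prefetching can only hide the smaller of the two costs: when loading takes longer than computing, the loop is still bound by the loader, and the extra thread adds overhead. Also note that when timing GPU code with ‘time.time()’, CUDA kernels are launched asynchronously, so a call to `torch.cuda.synchronize()` before reading the clock is usually needed to get meaningful numbers. The sketch does not forward exceptions raised inside the worker thread.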

How to write papers with Markdown

Last weekend I exported my Jupyter Notebook records to a PDF file. Surprisingly, the PDF looked so good that I began to think about using Jupyter Notebook or Markdown instead of LaTeX to write technical papers, because LaTeX is an extremely powerful but inconvenient writing tool. Then I created a file named ‘’:

Then use a command line to convert the Markdown file to PDF (if you hit problems like ‘Can’t find *.sty’, just run ‘sudo tlmgr install xxx’):

The PDF file looks like:


It does work, but the appearance looks too rigid. Then I found the ‘pandoc-latex-template’. After downloading and installing ‘eisvogel.tex’, I can generate the PDF with:

And the new style looks as below:


Actually, we can make fuller use of this template. Change ‘’ to:

Add a file ‘metadata.yaml’ to set the font:

Then run the command line:

The final document looks much more formal:


Summaries for Kaggle’s competition ‘Histopathologic Cancer Detection’

Firstly, I want to thank Alex Donchuk for his advice in the discussion of the competition ‘Histopathologic Cancer Detection’. His advice really helped me a lot.

1. Alex used ‘SE-ResNeXt50’. Instead, I used the standard ‘ResNeXt50’. Maybe this is why my score of ‘0.9716’ on the public leaderboard is not as good as Alex’s. After the competition, I spent some time reading the paper about ‘SE-ResNeXt50’. It’s a really simple and interesting idea for optimizing the architecture of a neural network. Maybe I can use this model in my next Kaggle competition.

2. In this competition, I split the training dataset into ten folds and trained three different models on different train/eval splits. Ensembling these three models gave a nice score. It seems bagging is a good method in practical applications.

3. After training a model to a ‘so far so good’ F1 score using SGD with ReduceLROnPlateau in Keras, I used this model as the ‘base model’ for subsequent fine-tuning. By ensembling all the high-scoring fine-tuned models, I eventually got my best score. This strategy comes from Snapshot Ensembles.

4. By the way, ReduceLROnPlateau is really useful when using SGD as the optimizer.
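
The fold-and-ensemble idea from point 2 can be sketched in plain Python. This is only an illustration under assumptions, not the competition code: `k_fold_splits` and `average_predictions` are hypothetical helpers, the ensemble is a simple mean of predicted probabilities, and the probabilities below are made-up numbers.

```python
import random


def k_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, eval_indices) for each of the n_folds folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        eval_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, eval_idx


def average_predictions(per_model_probs):
    """Average the predicted probabilities of several models, sample by sample."""
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


# Train three models on three different train/eval splits (training itself omitted).
chosen_splits = list(k_fold_splits(n_samples=100, n_folds=10))[:3]
for model_id, (train_idx, eval_idx) in enumerate(chosen_splits):
    print(f"model {model_id}: {len(train_idx)} train / {len(eval_idx)} eval samples")

# Made-up test-set probabilities from the three models, for two test images.
ensemble = average_predictions([[0.9, 0.2], [0.8, 0.4], [0.7, 0.3]])
print(ensemble)
```

Because each model sees a slightly different training set, their errors are partly independent, which is why averaging tends to help.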

Problems about using DistCp on Hadoop

After installing the whole Hadoop environment, I used DistCp to copy large files in a distributed cluster. But it reported an error:

It seems it can’t even find the basic MapReduce class. Then I checked the CLASSPATH for Hadoop:

Pretty strange: the HADOOP_CLASSPATH contains ‘mapreduce’ directories. It is supposed to be able to find the ‘Job’ class, unless the MapReduce jar package is in some other directory.
Finally, I found that the real MapReduce jar is actually in another location. Therefore I added those directories to HADOOP_CLASSPATH: edit ~/.bashrc and add the following line

DistCp works now.

Experiencing TensorCore on RTX 2080 Ti

My colleague’s bare-metal PC with a three-fan RTX 2080 Ti

My former colleague Jian Mei bought a new GPU, an RTX 2080 Ti, for training on bird images. After he installed the power supply and the GPU in his computer, I began to run my MobileNetV2 model on it. Unsurprisingly, the performance didn’t improve dramatically: training speed increased from 60 samples/sec to about 150 samples/sec.
The most likely reason for the poor performance is that the TensorCore was not being used.


To use the full power of TensorCore, and of my colleague’s RTX 2080 Ti, just following the guide <Mixed Precision Training> is not enough. I directly used the complete code example from NVIDIA’s GitHub.
Using the ResNeXt50 model from the example, the TensorCore does improve training performance:

                            Float32    Float16 (TensorCore)
Performance (samples/sec)   40         79

NVIDIA’s documentation reports a 20-times performance improvement from TensorCore. But on our RTX 2080 Ti, it only gained about 2 times (79/40 ≈ 1.98). Actually, mainstream neural network models, such as ResNet/DenseNet/NASNet, can’t use up a high-end NVIDIA GPU, since its floating-point computation power is too strong for them. To get the best results from TensorCore, I need to keep trying more complicated and dense models.

Using XGBoost to predict large sparse data

To predict with XGBoost, I wrote code like this:

But it reported an error:

It seems csr_matrix in SciPy is not supported by XGBoost. Maybe I need to convert the sparse data to dense:

But it still reported:

The ‘test’ data is so big that it can’t even be converted to dense data!
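
A rough back-of-the-envelope estimate shows why the dense conversion fails. The shape and number of non-zeros below are hypothetical (the real size of ‘test’ is not given here), and the CSR estimate assumes 8-byte values with 4-byte indices, which is a common SciPy layout but still an assumption:

```python
def dense_bytes(n_rows, n_cols, itemsize=8):
    """Memory for a dense matrix: every cell is stored, zero or not."""
    return n_rows * n_cols * itemsize


def csr_bytes(n_rows, nnz, itemsize=8, index_size=4):
    """Memory for CSR: values and column indices (nnz each) plus n_rows + 1 row pointers."""
    return nnz * (itemsize + index_size) + (n_rows + 1) * index_size


n_rows, n_cols = 1_000_000, 100_000  # hypothetical shape
nnz = 10 * n_rows                    # hypothetical: about 10 non-zeros per row

print(f"dense: {dense_bytes(n_rows, n_cols) / 1e9:.1f} GB")
print(f"CSR:   {csr_bytes(n_rows, nnz) / 1e9:.3f} GB")
```

With numbers like these, the dense form needs hundreds of gigabytes while the sparse form fits comfortably in memory, so any fix has to keep the data sparse.
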
XGBoost doesn’t accept the sparse format here, and my sparse data cannot be converted to dense. Then what should I do?

Actually, the solution is incredibly simple: just use XGBoost’s DMatrix!

Some summaries for Kaggle’s competition ‘Humpback Whale Identification’

This time, I only spent one month on the competition “Humpback Whale Identification”, but I still made a small step forward compared with previous competitions. Here are my summaries:

1. Do review the ‘kernels’ on the competition page; they taught me a lot of information and new techniques. By using a Siamese network rather than a classic model, I eventually beat the overfitting problem. Thanks for the suggestions from the ‘kernel’ page of the competition.

2. Be bold in using cutting-edge models, such as ResNeXt50 / DenseNet121. They are more powerful and easy to use.

3. Do use fine-tuning. Don’t train the model from scratch every time!

4. Ensemble learning is really powerful. I used three different models to ensemble the final result.

There are also some tips for future challenges (may be correct, may be wrong):

1. albumentations is a handy library for image augmentation

2. Cosine learning-rate decay performed worse than exponential learning-rate decay

3. LeakyReLU doesn’t work significantly better than ReLU

4. A bigger image size may not lead to higher accuracy
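
For reference, the two learning-rate schedules compared in tip 2 can be written as plain functions. These are the common textbook forms, not necessarily the exact schedules used in the competition, and the function names and parameter values are hypothetical:

```python
import math


def cosine_decay(step, total_steps, base_lr):
    """Cosine decay: starts at base_lr and falls smoothly to 0 at total_steps."""
    progress = min(step, total_steps) / total_steps
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


def exponential_decay(step, decay_steps, decay_rate, base_lr):
    """Exponential decay: multiplies the rate by decay_rate every decay_steps steps."""
    return base_lr * decay_rate ** (step / decay_steps)


for step in (0, 25, 50, 75, 100):
    cos_lr = cosine_decay(step, total_steps=100, base_lr=0.1)
    exp_lr = exponential_decay(step, decay_steps=25, decay_rate=0.5, base_lr=0.1)
    print(f"step {step:3d}: cosine={cos_lr:.4f}  exponential={exp_lr:.4f}")
```

The main difference is the shape: cosine decay keeps the rate high early on and drops it to zero at the end, while exponential decay shrinks it by a constant factor and never quite reaches zero.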

Using ResNeXt in Keras 2.2.4

To use ResNeXt50, I wrote my code following the Keras API documentation:

But it reported errors:

That’s weird. The code doesn’t work as the documentation says.
So I checked the code of Keras 2.2.4 (the version on my computer) and noticed that this version uses ‘keras_applications’ instead of ‘keras.applications’.
Then I changed my code:

But it reported another error:

With no other choice, I had to check the code in ‘/usr/lib/python3.6/site-packages/keras_applications/’ too. Finally, I realised the ResNeXt50() function needs three more arguments:

Now the program can run the ResNeXt50 model correctly. This GitHub issue explains the details: ‘keras_applications’ can be used with both Keras and TensorFlow, so the library details need to be passed into the model function.

Some tips about using Keras

1. How to use part of a model

The ‘img_embed’ model is part of ‘branch_model’. We should realise that ‘Model()’ is a CPU-heavy call, so the model should be created only once and then reused many times.

2. How to save a model when using ‘multi_gpu_model’

We should keep a reference to the original model; only through it can we save the model to a file.