We are trying to use the Faster R-CNN network (which is also one of the examples in MXNet) to automatically extract birds from pictures. But recognizing a bird in a picture takes about 10 seconds on a CPU, which is too slow for a production environment. To improve performance, I downloaded MKL version 2017u4 from the Intel site and installed it on the server. After recompiling MXNet with:
make USE_BLAS=openblas USE_CUDNN=1 USE_CUDA_PATH=/usr/local/cuda-8.0/ USE_CUDA=1 USE_MKL2017=1 -j
it takes only 3~4 seconds to recognize a bird in a picture. MKL really works!
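To make the before/after comparison reproducible, the timing can be wrapped in a small helper. The following is a minimal sketch, not the script used for the numbers above: `time_call` and `speedup` are hypothetical helper names, and `fn` stands for whatever callable runs one detection. Because MXNet executes operations asynchronously, the callable should block until the result is actually available, otherwise the measured time will be too short.

```python
import statistics
import time


def time_call(fn, warmup=1, runs=5):
    """Return the median wall-clock seconds of fn() over `runs` timed calls.

    `fn` is a hypothetical zero-argument callable that performs one
    inference and blocks until the result is ready.
    """
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (lazy initialization, caches)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(before_s, after_s):
    """Ratio of the old latency to the new latency."""
    return before_s / after_s


# Figures quoted in the post: about 10 s before MKL, 3 to 4 s after.
low = speedup(10.0, 4.0)
high = speedup(10.0, 3.0)
print(f"speedup: {low:.1f}x to {high:.1f}x")
```

With the figures quoted above, this works out to a speedup of roughly 2.5x to 3.3x.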
Using a GPU for inference is another option. But an EC2 instance with a GPU is much more expensive than a normal EC2 instance, so we will keep using the CPU for the near future.
For comparison, the hourly prices are $0.0928 per hour for the normal instance and $0.65 per hour for the GPU instance.
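The price gap can be turned into a break-even latency: how fast a GPU instance would have to be before it becomes cheaper per picture. The sketch below is a rough illustration under assumptions that are not in the post: the lower price is taken to be the CPU instance and the higher one the GPU instance, each instance processes one picture at a time, and it is kept fully busy. The function names are hypothetical.

```python
# Hourly prices quoted in the post. Assumption: the lower price belongs to
# the normal (CPU) instance and the higher one to the GPU instance.
CPU_PRICE_PER_HOUR = 0.0928
GPU_PRICE_PER_HOUR = 0.65


def cost_per_image(price_per_hour, seconds_per_image):
    """Dollar cost of one picture on a fully busy instance."""
    return price_per_hour / 3600.0 * seconds_per_image


def break_even_gpu_seconds(cpu_seconds):
    """GPU latency at which cost per picture equals the CPU cost."""
    return cpu_seconds * CPU_PRICE_PER_HOUR / GPU_PRICE_PER_HOUR


price_ratio = GPU_PRICE_PER_HOUR / CPU_PRICE_PER_HOUR
print(f"GPU instance costs {price_ratio:.1f}x as much per hour")

# CPU latencies measured with MKL in the post: 3 to 4 seconds per picture.
for cpu_seconds in (3.0, 4.0):
    limit = break_even_gpu_seconds(cpu_seconds)
    print(f"CPU at {cpu_seconds:.0f} s: GPU must be under {limit:.2f} s per picture")
```

Under these assumptions the GPU instance costs about seven times as much per hour, so it only pays off per picture if it is more than about seven times faster than the MKL-accelerated CPU.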