We are trying to use the Faster R-CNN network (which is also one of the examples shipped with MXNet) to automatically extract birds from pictures. However, recognizing a bird in a single picture takes about 10 seconds on a CPU, which is too slow for a production environment. To improve performance, I downloaded Intel MKL version 2017u4 from the Intel site and installed it on the server. After recompiling MXNet:
make clean
make USE_BLAS=openblas USE_CUDNN=1 USE_CUDA_PATH=/usr/local/cuda-8.0/ USE_CUDA=1 USE_MKL2017=1 -j
it takes only 3~4 seconds to recognize a bird in a picture. MKL really works!
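To compare builds fairly, it helps to time the same set of pictures before and after recompiling. The following is only a minimal sketch of such a timing harness: `time_inference` and `fake_detect` are hypothetical names, and `fake_detect` is a stand-in that would need to be replaced by the real Faster R-CNN forward pass.

```python
import statistics
import time


def time_inference(fn, inputs, warmup=1):
    """Return the wall-clock latency in seconds of fn for each input."""
    # Warm-up calls are not timed, so one-off setup cost does not
    # distort the per-picture numbers.
    for x in inputs[:warmup]:
        fn(x)
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        fn(x)
        latencies.append(time.perf_counter() - start)
    return latencies


def fake_detect(image):
    # Hypothetical placeholder for the real detector call.
    time.sleep(0.01)
    return []


if __name__ == "__main__":
    lat = time_inference(fake_detect, ["a.jpg", "b.jpg", "c.jpg"])
    print("median latency: %.3f s" % statistics.median(lat))
```

Reporting the median rather than a single run makes the comparison less sensitive to one slow picture.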
Using a GPU for inference is another option, but an EC2 instance with a GPU is much more expensive than a normal EC2 instance, so we will keep using the CPU for the near future.
Instance Type | vCPU | ECU | Memory (GiB) | Instance Storage (GB) | Linux/UNIX Usage |
t2.large | 2 | Variable | 8 | EBS Only | $0.0928 per Hour |
g2.2xlarge | 8 | 26 | 15 | 60 SSD | $0.65 per Hour |
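The price gap can be turned into a rough break-even figure. The sketch below uses the two hourly prices from the table and assumes a CPU latency of 3.5 seconds (the midpoint of the 3~4 seconds measured with MKL). It also assumes each instance processes one picture at a time at full utilization and ignores the burstable CPU credits of t2 instances, so treat the numbers as an estimate only. The function names are made up for illustration.

```python
# Hourly on-demand prices in USD, copied from the table.
PRICE_PER_HOUR = {"t2.large": 0.0928, "g2.2xlarge": 0.65}


def cost_per_1000_images(price_per_hour, seconds_per_image):
    """Cost in USD of processing 1000 pictures back to back."""
    return price_per_hour * seconds_per_image * 1000 / 3600


def break_even_latency(cpu_latency, cpu_price, gpu_price):
    """GPU latency in seconds at which cost per picture matches the CPU."""
    return cpu_latency * cpu_price / gpu_price


if __name__ == "__main__":
    cpu_latency = 3.5  # assumed midpoint of the measured 3~4 s
    cpu_price = PRICE_PER_HOUR["t2.large"]
    gpu_price = PRICE_PER_HOUR["g2.2xlarge"]
    print("CPU cost per 1000 pictures: $%.4f"
          % cost_per_1000_images(cpu_price, cpu_latency))
    print("GPU break-even latency: %.2f s"
          % break_even_latency(cpu_latency, cpu_price, gpu_price))
```

Under these assumptions the g2.2xlarge costs about 7 times as much per hour, so it would have to process a picture in roughly 0.5 seconds to match the CPU instance on cost per picture.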