After using resnet_v2_50 from tensorflow/models, I found that the inference results were completely wrong, even though the training accuracy looked very good.
First, I suspected the augmentation I used to regularize the samples:

  image = tf.image.resize_image_with_crop_or_pad(image, IMAGE_HEIGHT + 66, IMAGE_WIDTH + 66)
  image = tf.random_crop(image, [IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANNELS])
  image = tf.image.random_flip_left_right(image)

Indeed, I had padded the image to too large a size. But after I changed the padding size to 10, the inference accuracy was still wrong.
Then I checked the data-import code:
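To put a number on "too large", here is a small sketch of how much zero padding a random crop can contain along one axis. It assumes that resize_image_with_crop_or_pad splits the padding evenly between both sides and that tf.random_crop picks its offset uniformly; the helper name max_and_mean_padding is made up for this illustration.

```python
def max_and_mean_padding(pad_total):
    """Along one axis: worst-case and average count of zero-padded pixels
    inside a random crop whose size equals the original image size."""
    half = pad_total // 2  # padding added on each side of the image
    # The crop offset is uniform over 0..pad_total, and the crop then
    # contains |offset - half| padded pixels along this axis.
    padded = [abs(offset - half) for offset in range(pad_total + 1)]
    return max(padded), sum(padded) / len(padded)


for pad in (66, 10):
    worst, mean = max_and_mean_padding(pad)
    print(pad, worst, round(mean, 2))
```

With 66 extra pixels a crop can lose up to 33 rows or columns to black padding, and with 10 at most 5, so the smaller value distorts the training samples far less.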

import cv2

# To avoid dealing with various picture formats, I encode every image as JPEG and write it to a TFRecord
img = cv2.imread(file_name)
raw_image = cv2.imencode('.jpeg', img)[1].tobytes()  # tobytes() replaces the deprecated tostring()
....
# When importing data from TFRecord
image = tf.image.decode_image(image)

and changed my inference code to use the same data-import routines. But the problem persisted.

About a week passed. Finally, I found this issue on GitHub. It answers all my questions: the cause is slim.batch_norm(). I then added the following code to my program (adapted from slim.create_train_op()):

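from tensorflow.python.framework import ops
from tensorflow.python.ops import control_flow_ops

# Make the loss depend on the batch-norm update ops so they run on every training step
update_ops = set(ops.get_collection(ops.GraphKeys.UPDATE_OPS))
with ops.control_dependencies(update_ops):
  barrier = control_flow_ops.no_op(name='update_barrier')
total_loss = control_flow_ops.with_dependencies([barrier], total_loss)
grads = optimizer.compute_gradients(total_loss)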
...

The inference accuracy was still low. With no other choice, I removed every slim.batch_norm() call from resnet_v2.py, and this time the inference accuracy matched the training accuracy.
It looks like the problem is partly solved, but I still need to find out why slim.batch_norm() doesn’t work well in inference …
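
For context, here is a toy sketch in plain Python (no TensorFlow) of how batch normalization behaves differently in training and inference in general. It is not a diagnosis of the problem above, whose cause is still open. The class BatchNorm1D, its run_update_ops flag, and the decay of 0.9 are made up for this illustration; the flag stands in for whether the moving-average update ops actually run during training.

```python
import random


class BatchNorm1D:
    """Toy scalar batch norm without learned scale/offset (illustration only)."""

    def __init__(self, decay=0.9, eps=1e-3):
        self.decay = decay
        self.eps = eps
        self.moving_mean = 0.0  # initial moving statistics
        self.moving_var = 1.0

    def __call__(self, batch, is_training, run_update_ops=True):
        if is_training:
            # Training normalizes with the statistics of the current batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            if run_update_ops:
                # This is the part that the update ops are responsible for.
                self.moving_mean = self.decay * self.moving_mean + (1 - self.decay) * mean
                self.moving_var = self.decay * self.moving_var + (1 - self.decay) * var
        else:
            # Inference normalizes with the stored moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


random.seed(0)
# Synthetic activations with mean 5 and standard deviation 2.
data = [[random.gauss(5.0, 2.0) for _ in range(64)] for _ in range(200)]

updated, stale = BatchNorm1D(), BatchNorm1D()
for batch in data:
    updated(batch, is_training=True, run_update_ops=True)
    stale(batch, is_training=True, run_update_ops=False)

sample = data[0]
out_updated = updated(sample, is_training=False)
out_stale = stale(sample, is_training=False)
mean_updated = sum(out_updated) / len(out_updated)
mean_stale = sum(out_stale) / len(out_stale)
print("inference output mean, updates run:    ", round(mean_updated, 2))
print("inference output mean, updates skipped:", round(mean_stale, 2))
```

When the updates are skipped, the moving statistics stay at their initial values, so the inference output is far from the roughly zero-centered values the following layers saw during training. If the moving averages in a real graph look plausible, the cause lies elsewhere, for example in how is_training is passed to the model at inference time.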