Google has published their quantization method in this paper. It uses int8 for the forward pass but float32 for back-propagation, since back-propagation needs higher precision to accumulate gradients. A question came to me right after reading the paper: why are all the performance tests done on mobile phones (ARM architecture)? Inference with a model quantized by Google's method requires not only int8 addition and multiplication, but also bit-shift operations. The AVX instruction set of the Intel x86_64 architecture can accelerate MAC (multiply-accumulate) operations, but cannot speed up bit-shift operations.
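
To make the bit-shift part concrete, here is a toy sketch (my own illustration, not Google's actual kernel code): after an int8 matrix multiplication, the int32 accumulator has to be rescaled by real_multiplier = (input_scale * weight_scale) / output_scale, and the paper expresses that real multiplier as a fixed-point int32 multiplier plus a right shift, so inference never leaves integer arithmetic. All function and variable names below are mine, chosen just for the example.

# Toy illustration of the requantization step: rescale an int32 accumulator
# down to the int8 range using only an integer multiply and bit shifts.

def quantize_multiplier(real_multiplier):
    """Express a real multiplier in (0, 1) as (int32 fixed-point value, right shift)."""
    shift = 0
    while real_multiplier < 0.5:
        real_multiplier *= 2.0
        shift += 1
    return int(round(real_multiplier * (1 << 31))), shift

def requantize(acc_int32, fixed_point_multiplier, shift):
    """Rescale the accumulator with an integer multiply plus bit shifts
    (rounding is omitted here for brevity)."""
    high = (acc_int32 * fixed_point_multiplier) >> 31   # high half of the 64-bit product
    return high >> shift

m, s = quantize_multiplier(0.0008)
print(requantize(12345, m, s))   # ~12345 * 0.0008 = 9.876, truncated to 9
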
To verify my suspicion, I wrote a model with ResNet-50 (float32) to classify the CIFAR-100 dataset. After training for a few epochs, I evaluated the inference speed using my evaluation script (shown below). The result is:

Time: 5.58819s

Then, I followed these steps to add tf.contrib.quantize.create_training_graph() and tf.contrib.quantize.create_eval_graph() to my code. This time, the inference speed is:

Time: 6.23221s

A little disappointing: the quantized (int8) version of the model does not accelerate inference on an x86 CPU. Maybe we need to find a more powerful quantization algorithm.
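
For reference, this is roughly where those two rewrite calls go. It is a sketch based on the tf.contrib.quantize documentation rather than a copy of my full scripts; the tiny conv net and the quant_delay value are just placeholders standing in for the ResNet-50 setup.

import tensorflow as tf

def tiny_model(images):
    # stand-in for the ResNet-50 construction used in the experiment
    net = tf.layers.conv2d(images, 16, 3, padding='same', activation=tf.nn.relu)
    net = tf.layers.flatten(net)
    return tf.layers.dense(net, 100)

# training graph: insert fake-quant ops BEFORE building the train op
train_graph = tf.Graph()
with train_graph.as_default():
    images = tf.placeholder(tf.float32, [None, 32, 32, 3])
    labels = tf.placeholder(tf.int64, [None])
    logits = tiny_model(images)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    tf.contrib.quantize.create_training_graph(quant_delay=2000)  # rewrites the default graph in place
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

# eval graph: same model, rewritten for inference
eval_graph = tf.Graph()
with eval_graph.as_default():
    images = tf.placeholder(tf.float32, [None, 32, 32, 3])
    logits = tiny_model(images)
    tf.contrib.quantize.create_eval_graph()

After training, the rewritten eval graph is frozen and exported as a .pb file, which is what the evaluation script below loads.
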

from input_data import Cifar100Data
import tensorflow as tf
import numpy as np
import resnet_v2
import argparse
import time
import sys
BATCH_SIZE = 10000
MODEL_PATH = './models/'
MODEL_NAME = 'cifar_resnet_50'
def cnn_part(images):
    ivg, _ = resnet_v2.resnet_v2_50(images, 100)
    return ivg
def main(_):
    with tf.device('/cpu:0'):
        images = tf.placeholder(tf.float32, [BATCH_SIZE, 32, 32, 3])
        labels = tf.placeholder(tf.int64, [BATCH_SIZE])
    with tf.contrib.slim.arg_scope([tf.contrib.slim.conv2d],
                        weights_initializer = tf.truncated_normal_initializer(mean = 0, stddev = 0.1)):
        image_vector = cnn_part(images)
    loss = tf.losses.sparse_softmax_cross_entropy(labels = labels, logits = image_vector)
    loss = tf.reduce_mean(loss)
    opt = tf.train.AdamOptimizer(1e-3)
    train_op = tf.contrib.slim.learning.create_train_op(loss, opt)
    correct_prediction = tf.equal(tf.argmax(image_vector, 1), labels)
    correct_prediction = tf.cast(correct_prediction, tf.float32)
    accuracy = tf.reduce_mean(correct_prediction)
    data = Cifar100Data('/disk3/cifar/cifar-100-python/test')
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # load the frozen quantized graph and merge it into the current graph
        with tf.gfile.FastGFile('./models/cifar_resnet_50_quant.pb', 'rb') as fl:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(fl.read())
        tf.import_graph_def(graph_def, name = '')
        saver.restore(sess, MODEL_PATH + MODEL_NAME + '-' + str(FLAGS.epoch))
        batch = data.next_batch(BATCH_SIZE)
        for i in range(3):
            begin = time.time()
            res = sess.run(accuracy, feed_dict = {images: batch[0], labels: batch[1]})
            print("Time: %gs" % (time.time() - begin))
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--epoch', type=str,
                        help='Epoch of checkpoint for evaluation')
    FLAGS, unparsed = parser.parse_known_args()
    tf.app.run(main = main, argv = [sys.argv[0]] + unparsed)