When I run code like:
with tf.device('/GPU:0'):
    images = tf.random_crop(images, [IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANNELS])
...
it reports:
Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
It looks like tf.random_crop() doesn't have a CUDA kernel implementation, so I need to write the crop myself. The solution is surprisingly simple: write a function that randomly crops a single image using tf.random_uniform() and tf.slice(), then use tf.map_fn() to apply it to every image in the batch.
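def my_random_crop(value, size):
    # Crop a single tensor to `size`, starting at a random offset.
    shape = tf.shape(value)
    size = tf.convert_to_tensor(size, dtype=tf.int32)
    # Number of valid start positions along each dimension.
    limit = shape - size + 1
    # Draw a large random integer per dimension and wrap it into [0, limit).
    offset = tf.random_uniform(tf.shape(shape), dtype=size.dtype, maxval=size.dtype.max) % limit
    return tf.slice(value, offset, size)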
...
images = tf.map_fn(lambda img: my_random_crop(img, [IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANNELS]), images)
With this change, the crop runs on the GPU.
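To see why the offset arithmetic in my_random_crop() never reads outside the image, the same calculation can be sketched in plain Python without TensorFlow. This is only an illustration: the helper name random_crop_offsets and the 32x32x3 to 24x24x3 sizes are made up for the example, and the standard random module stands in for tf.random_uniform().

```python
import random

def random_crop_offsets(shape, size, rng=random):
    """Pick a random start index per dimension so the crop stays in bounds.

    Hypothetical helper mirroring the offset logic of my_random_crop().
    """
    if len(shape) != len(size):
        raise ValueError("shape and size must have the same rank")
    offsets = []
    for dim, crop in zip(shape, size):
        if crop > dim:
            raise ValueError("crop size exceeds input size")
        # Number of valid start positions along this dimension.
        limit = dim - crop + 1
        # A large random non-negative integer, wrapped into [0, limit).
        offsets.append(rng.randrange(2**31 - 1) % limit)
    return offsets

# Assumed example sizes: crop a 32x32x3 image down to 24x24x3.
offsets = random_crop_offsets([32, 32, 3], [24, 24, 3])
```

Because the channel dimension is not cropped, its limit is 1 and its offset is always 0. For height and width the offset falls between 0 and 8 inclusive, so offset + size never exceeds the original shape.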