Using keras.layers.Embedding instead of a Python dictionary
At first, I used a function to convert the words of a text into word embeddings:
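    import numpy as np

    # EMBEDDING_LENGTH and MAX_TEXT_LENGTH are constants defined elsewhere in the script.
    def text_to_array(text, embeddings_index):
        empty_embed = np.zeros(EMBEDDING_LENGTH, dtype=np.float32)
        # Drop the last character, then keep at most MAX_TEXT_LENGTH words.
        text = text[:-1].split()[:MAX_TEXT_LENGTH]
        embeds = []
        for x in text:
            em = embeddings_index.get(x)
            if em is not None:  # words without an embedding are skipped
                embeds.append(em)
        # Pad with zero vectors up to a fixed length.
        embeds += [empty_embed] * (MAX_TEXT_LENGTH - len(embeds))
        return np.array(embeds, dtype=np.float32)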
But I noticed that this uses a lot of CPU while GPU usage stays low. The reason is simple: looking up every word in a dictionary from single-threaded Python is inefficient. We should use the Embedding layer in Keras…
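As a rough sketch of what that switch might look like (not necessarily the code from the full post): give each word an integer id, stack the vectors into one matrix, and feed the model ids so the lookup happens inside the Embedding layer. The helper names build_vocab, text_to_ids and make_embedding_layer are hypothetical, the two constants are toy values chosen for illustration, and `import keras` assumes a Keras installation that exposes keras.initializers.Constant.

```python
import numpy as np

# Toy values for illustration only; the real constants are not shown in this excerpt.
EMBEDDING_LENGTH = 4
MAX_TEXT_LENGTH = 6


def build_vocab(embeddings_index):
    """Give every word a row number (0 is reserved for padding) and stack the vectors."""
    word_to_id = {}
    matrix = np.zeros((len(embeddings_index) + 1, EMBEDDING_LENGTH), dtype=np.float32)
    for i, (word, vector) in enumerate(embeddings_index.items(), start=1):
        word_to_id[word] = i
        matrix[i] = vector
    return word_to_id, matrix


def text_to_ids(text, word_to_id):
    """Same truncation and skipping rules as text_to_array, but returns integer ids."""
    words = text[:-1].split()[:MAX_TEXT_LENGTH]
    ids = [word_to_id[w] for w in words if w in word_to_id]
    ids += [0] * (MAX_TEXT_LENGTH - len(ids))  # id 0 maps to the all-zero row
    return np.array(ids, dtype=np.int32)


def make_embedding_layer(matrix):
    """Frozen Keras layer that performs the vector lookup inside the model."""
    import keras  # imported lazily so the helpers above need only NumPy

    return keras.layers.Embedding(
        input_dim=matrix.shape[0],
        output_dim=matrix.shape[1],
        embeddings_initializer=keras.initializers.Constant(matrix),
        trainable=False,  # keep the pretrained vectors fixed
    )
```

With this layout the Python side only produces small integer arrays, and `matrix[ids]` gives the same padded array that text_to_array builds, so the model's first layer can be make_embedding_layer(matrix) applied to an integer input of length MAX_TEXT_LENGTH.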