At long last, it’s translation time. Inside the test function, we’ll decode the sentence word by word using the output state that we retrieved from the encoder (which becomes our decoder’s initial hidden state). We’ll also update the decoder hidden state after each word so that we use previously decoded words to help decode new ones.
To tackle one word at a time, we need a while loop that will run until one of two things happens (we don’t want the model generating words forever):
- The current token is the end-of-sequence token (the counterpart of the "<START>" token).
- The decoded Spanish sentence length hits the maximum target sentence length.
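The shape of that loop can be sketched on its own, before any model is involved. Here the "decoder" is a hypothetical stand-in that emits a fixed word list and then a "<STOP>" token (the token names and `max_decoder_seq_length` are illustrative assumptions, not from a trained model):

```python
# Hypothetical stand-in for decoder output: a fixed script of words
# ending in a stop token, so the loop's exit conditions are visible.
fake_outputs = ["hola", "mundo", "<STOP>"]
max_decoder_seq_length = 10  # assumed cap on translation length

decoded_sentence = ""
stop_condition = False
step = 0
while not stop_condition:
    sampled_token = fake_outputs[step]
    step += 1
    # Stop on the end token, or when the sentence hits the length cap:
    if sampled_token == "<STOP>" or step >= max_decoder_seq_length:
        stop_condition = True
    else:
        decoded_sentence += " " + sampled_token

print(decoded_sentence.strip())  # hola mundo
```

The real version swaps the scripted word list for a call to the decoder model, but the two exit conditions stay exactly the same.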
Inside the while loop, the decoder model can use the current target sequence (beginning with the "<START>" token) and the current state (initially passed to us from the encoder model) to get a bunch of possible next words and their corresponding probabilities. In Keras, it looks something like this:
output_tokens, new_decoder_hidden_state, new_decoder_cell_state = \
    decoder_model.predict(
        [target_seq] + decoder_states_value)
Next, we can use NumPy’s .argmax() function to determine the token (word) with the highest probability and add it to the decoded sentence:
# Slicing [0, -1, :] gives us a
# specific token vector within the
# 3d NumPy matrix
sampled_token_index = np.argmax(
    output_tokens[0, -1, :])
# The reverse features dictionary
# translates back from index to Spanish
sampled_token = reverse_target_features_dict[
    sampled_token_index]
decoded_sentence += " " + sampled_token
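To see the argmax-and-lookup step in isolation, here is a runnable miniature version. The four-word vocabulary and the probability values are made up for illustration; the slicing and dictionary lookup mirror the snippet above:

```python
import numpy as np

# Hypothetical miniature vocabulary (index -> word):
reverse_target_features_dict = {
    0: "<START>", 1: "hola", 2: "mundo", 3: "<STOP>"}

# Pretend this came back from decoder_model.predict().
# Shape (1, 1, num_decoder_tokens): one sample, one timestep,
# one probability per vocabulary token.
output_tokens = np.array([[[0.05, 0.80, 0.10, 0.05]]])

# [0, -1, :] selects the token-probability vector
# for the last timestep of the first (only) sample:
sampled_token_index = np.argmax(output_tokens[0, -1, :])
sampled_token = reverse_target_features_dict[sampled_token_index]

print(sampled_token)  # hola
```

Because "hola" holds the highest probability (0.80), it is the word that gets appended to the decoded sentence.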
Our final step is to update a few values for the next word in the sequence:
# Move to the next timestep
# of the target sequence:
target_seq = np.zeros((1, 1, num_decoder_tokens))
target_seq[0, 0, sampled_token_index] = 1.
# Update the states with values from
# the most recent decoder prediction:
decoder_states_value = [
    new_decoder_hidden_state,
    new_decoder_cell_state]
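The target_seq update builds a fresh one-hot vector for the word we just sampled. A small self-contained check (with an assumed vocabulary size of 4) shows what that array looks like:

```python
import numpy as np

num_decoder_tokens = 4    # hypothetical vocabulary size
sampled_token_index = 1   # index we just sampled (e.g. "hola")

# A fresh all-zeros timestep, then a single 1. at the sampled index:
target_seq = np.zeros((1, 1, num_decoder_tokens))
target_seq[0, 0, sampled_token_index] = 1.

print(target_seq[0, 0])  # [0. 1. 0. 0.]
```

On the next loop iteration, this one-hot vector is what tells the decoder which word it just produced.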
And now we can test it all out!
You may recall that, because of platform constraints here, we’re using very little data. As a result, we can only expect our model to translate a handful of sentences coherently. Luckily, you will have an opportunity to try this out on your own computer with far more data to see some much more impressive results.
We’ve added the while loop in, along with the stop conditions.
Now it’s your turn to fill in the rest…
- Call decoder_model.predict() on [target_seq] + decoder_states_value to get output_tokens, new_decoder_hidden_state, and new_decoder_cell_state.
Next up, we’ll get the most probable next word in the sequence (according to our model) and add it to the decoded sentence.
- Save the result of calling np.argmax() on output_tokens[0, -1, :] to sampled_token_index.
- Use sampled_token_index to find our next word in reverse_target_features_dict. Save the result to sampled_token, replacing the empty string ("").
- Add the word to decoded_sentence. (Make sure you add a space before the word.)
- Redefine target_seq as a NumPy array of zeros with dimensions (1, 1, num_decoder_tokens).
- Give target_seq[0, 0, sampled_token_index] a value of 1.. This one-hot vector now represents the token we just sampled.
- And, finally, update decoder_states_value with a list of new_decoder_hidden_state and new_decoder_cell_state (in this order).
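The steps above can be assembled into one runnable sketch. Since we can’t load a trained Keras model here, fake_predict() below is a hypothetical stub that plays back a fixed word sequence; everything else (the "<START>" seed, the argmax lookup, the one-hot update, and the state update) follows the loop described in this exercise:

```python
import numpy as np

# Hypothetical setup so the full loop runs end to end:
num_decoder_tokens = 4
max_decoder_seq_length = 10
target_features_dict = {"<START>": 0, "hola": 1, "mundo": 2, "<STOP>": 3}
reverse_target_features_dict = {i: w for w, i in target_features_dict.items()}

# Stub standing in for decoder_model.predict(): it ignores its inputs
# and emits "hola", "mundo", then "<STOP>". A real Keras model returns
# (output_tokens, hidden_state, cell_state) in the same shape.
script = iter([1, 2, 3])
def fake_predict(inputs):
    probs = np.zeros((1, 1, num_decoder_tokens))
    probs[0, -1, next(script)] = 1.
    return probs, np.zeros((1, 8)), np.zeros((1, 8))

# Seed with the "<START>" token and (stubbed) encoder states:
target_seq = np.zeros((1, 1, num_decoder_tokens))
target_seq[0, 0, target_features_dict["<START>"]] = 1.
decoder_states_value = [np.zeros((1, 8)), np.zeros((1, 8))]

decoded_sentence = ""
stop_condition = False
while not stop_condition:
    output_tokens, new_decoder_hidden_state, new_decoder_cell_state = \
        fake_predict([target_seq] + decoder_states_value)
    # Most probable next word:
    sampled_token_index = np.argmax(output_tokens[0, -1, :])
    sampled_token = reverse_target_features_dict[sampled_token_index]
    # Stop conditions:
    if (sampled_token == "<STOP>"
            or len(decoded_sentence.split()) >= max_decoder_seq_length):
        stop_condition = True
    else:
        decoded_sentence += " " + sampled_token
    # One-hot vector for the token we just sampled:
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, sampled_token_index] = 1.
    # Carry the decoder states forward:
    decoder_states_value = [new_decoder_hidden_state,
                            new_decoder_cell_state]

print(decoded_sentence.strip())  # hola mundo
```

Swapping fake_predict for the real decoder_model.predict (and the stubbed zero states for the encoder’s output states) gives the actual translation loop.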
How does the Spanish look to you?