At long last, it’s translation time. Inside the test function, we’ll decode the sentence word by word using the output state that we retrieved from the encoder (which becomes our decoder’s initial hidden state). We’ll also update the decoder hidden state after each word so that we use previously decoded words to help decode new ones.

To tackle one word at a time, we need a while loop that will run until one of two things happens (we don’t want the model generating words forever):

  • The current token is "<END>".
  • The decoded Spanish sentence length hits the maximum target sentence length.
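Here is a minimal sketch of those two stop conditions on their own, with no model involved. The names pretend_predictions and decode_until_stop are hypothetical stand-ins, and counting the length in words is an assumption made for this example:

```python
# Hypothetical stand-ins, not taken from the lesson's data.
max_decoder_seq_length = 5
pretend_predictions = ["Hola", "mundo", "<END>", "nunca"]


def decode_until_stop(predicted_tokens, max_length):
    """Collect tokens until "<END>" shows up or the word cap is reached."""
    decoded_sentence = ""
    position = 0
    stop_condition = False
    while not stop_condition:
        sampled_token = predicted_tokens[position]
        decoded_sentence += " " + sampled_token
        position += 1
        # Stop on the end token, on the length cap, or when the
        # pretend predictions run out (a guard for this toy only).
        if (sampled_token == "<END>"
                or len(decoded_sentence.split()) >= max_length
                or position >= len(predicted_tokens)):
            stop_condition = True
    return decoded_sentence


stopped_by_end_token = decode_until_stop(pretend_predictions, max_decoder_seq_length)
stopped_by_length = decode_until_stop(pretend_predictions, 1)
```

The first call ends because it reaches "<END>"; the second ends after a single word because of the length cap.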

Inside the while loop, the decoder model can use the current target sequence (beginning with the "<START>" token) and the current state (initially passed to us from the encoder model) to get a bunch of possible next words and their corresponding probabilities. In Keras, it looks something like this:

output_tokens, new_decoder_hidden_state, new_decoder_cell_state = decoder_model.predict(
    [target_seq] + decoder_states_value)
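It may help to see what [target_seq] + decoder_states_value actually builds: a plain Python list of three arrays (the one-hot target sequence, then the hidden state, then the cell state). The sketch below uses made-up sizes; num_decoder_tokens = 4 and latent_dim = 3 are assumptions for illustration, not values from the lesson:

```python
import numpy as np

num_decoder_tokens = 4  # assumed toy vocabulary size
latent_dim = 3          # assumed size of each LSTM state vector

# One-hot target sequence for a single timestep; index 0 stands in
# for the "<START>" token in this toy example.
target_seq = np.zeros((1, 1, num_decoder_tokens))
target_seq[0, 0, 0] = 1.

# Hidden state and cell state, in that order.
decoder_states_value = [np.zeros((1, latent_dim)), np.zeros((1, latent_dim))]

# List concatenation gives one flat list with three entries.
decoder_inputs = [target_seq] + decoder_states_value
input_shapes = [array.shape for array in decoder_inputs]
```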

Next, we can use NumPy’s np.argmax() function to determine the token (word) with the highest probability and add it to the decoded sentence:

# Slicing [0, -1, :] gives us a
# specific token vector within the
# 3D NumPy matrix
sampled_token_index = np.argmax(
    output_tokens[0, -1, :])
# The reverse features dictionary
# translates back from index to Spanish
sampled_token = reverse_target_features_dict[
    sampled_token_index]
decoded_sentence += " " + sampled_token
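To see the slicing and lookup in action without a trained model, here is a small sketch with hand-written probabilities. The four-word dictionary and the probability values are invented for illustration:

```python
import numpy as np

# Hypothetical toy dictionary; the lesson builds its own from the data.
reverse_target_features_dict = {0: "<START>", 1: "Hola", 2: "mundo", 3: "<END>"}

# Shape (1, 1, 4): one sentence, one timestep, four vocabulary entries,
# mimicking the shape of the decoder's output.
output_tokens = np.array([[[0.05, 0.70, 0.20, 0.05]]])

# First sentence, last timestep, every vocabulary entry.
token_vector = output_tokens[0, -1, :]

# Index of the largest probability, then back to a word.
sampled_token_index = int(np.argmax(token_vector))
sampled_token = reverse_target_features_dict[sampled_token_index]

decoded_sentence = ""
decoded_sentence += " " + sampled_token
```

Picking the single most probable word at every step is often called greedy decoding.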

Our final step is to update a few values for the next word in the sequence:

# Move to the next timestep
# of the target sequence:
target_seq = np.zeros((1, 1, num_decoder_tokens))
target_seq[0, 0, sampled_token_index] = 1.
# Update the states with values from
# the most recent decoder prediction:
decoder_states_value = [
    new_decoder_hidden_state,
    new_decoder_cell_state]
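Putting the three steps together, the whole loop might look roughly like the sketch below. To keep it runnable without Keras, it swaps the real decoder_model for a hypothetical StubDecoder that always predicts the next vocabulary index; the toy vocabulary, the state sizes, and the word-based length cap are likewise assumptions for illustration:

```python
import numpy as np

# Hypothetical toy vocabulary; the lesson derives its own from the data.
target_features_dict = {"<START>": 0, "Hola": 1, "mundo": 2, "<END>": 3}
reverse_target_features_dict = {i: t for t, i in target_features_dict.items()}
num_decoder_tokens = len(target_features_dict)
max_decoder_seq_length = 10  # assumed cap, counted in words here


class StubDecoder:
    """Stand-in for decoder_model: always predicts the next vocabulary index."""

    def predict(self, inputs):
        target_seq, hidden_state, cell_state = inputs
        current_index = int(np.argmax(target_seq[0, -1, :]))
        next_index = min(current_index + 1, num_decoder_tokens - 1)
        output_tokens = np.zeros((1, 1, num_decoder_tokens))
        output_tokens[0, 0, next_index] = 1.
        # A real LSTM would compute new states; the stub just bumps them.
        return output_tokens, hidden_state + 1, cell_state + 1


def decode_sequence(decoder_model, decoder_states_value):
    # Begin with a one-hot "<START>" token.
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_features_dict["<START>"]] = 1.
    decoded_sentence = ""
    stop_condition = False
    while not stop_condition:
        output_tokens, new_h, new_c = decoder_model.predict(
            [target_seq] + decoder_states_value)
        # Pick the most probable token and append it.
        sampled_token_index = int(np.argmax(output_tokens[0, -1, :]))
        sampled_token = reverse_target_features_dict[sampled_token_index]
        decoded_sentence += " " + sampled_token
        if (sampled_token == "<END>"
                or len(decoded_sentence.split()) >= max_decoder_seq_length):
            stop_condition = True
        # Feed the sampled token and the new states into the next step.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.
        decoder_states_value = [new_h, new_c]
    return decoded_sentence


initial_states = [np.zeros((1, 2)), np.zeros((1, 2))]
translation = decode_sequence(StubDecoder(), initial_states)
```

With a trained Keras decoder in place of the stub, the loop body would stay the same; only the predictions would come from the model instead of a fixed rule.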

And now we can test it all out!

You may recall that, because of platform constraints here, we’re using very little data. As a result, we can only expect our model to translate a handful of sentences coherently. Luckily, you will have an opportunity to try this out on your own computer with far more data to see some much more impressive results.



We’ve added the while loop in, along with the stop conditions.

Now it’s your turn to fill in the rest…

Call decoder_model.predict() on [target_seq] + decoder_states_value to get output_tokens, new_decoder_hidden_state, and new_decoder_cell_state respectively.


Next up, we’ll get the most probable next word in the sequence (according to our model) and add it to the decoded sentence.

  • Define sampled_token_index as np.argmax() called on output_tokens[0, -1, :].
  • Use sampled_token_index to find our next word in reverse_target_features_dict. Save the result to sampled_token, replacing the empty string ("").
  • Add the word to decoded_sentence. (Make sure you add a space before the word.)

Reset target_seq as a NumPy array of zeros with dimensions (1, 1, num_decoder_tokens).

Set target_seq[0, 0, sampled_token_index] to 1. (that is, the float 1.0). This one-hot vector now represents the token we just sampled.

And, finally, update decoder_states_value with a list containing new_decoder_hidden_state and new_decoder_cell_state (in this order).

How does the Spanish look to you?
