It’s time for some deep learning!

Deep learning models in Keras are built in layers, where each layer is a step in the model.

Our encoder requires two layer types from Keras:

  • An input layer, which defines a matrix to hold all the one-hot vectors that we’ll feed to the model.
  • An LSTM layer, with some output dimensionality.
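To make the "matrix of one-hot vectors" concrete, here is a minimal sketch in plain Python. The vocabulary, the sentence, and the helper name one_hot_matrix are made up for illustration and are not part of the lesson's code; the point is only that each sentence becomes one row per token, with each row as wide as the vocabulary.

```python
def one_hot_matrix(sentence_tokens, vocabulary):
    """Return one row per token; each row is all zeros except a single 1.0."""
    index = {token: i for i, token in enumerate(vocabulary)}
    matrix = []
    for token in sentence_tokens:
        row = [0.0] * len(vocabulary)
        row[index[token]] = 1.0
        matrix.append(row)
    return matrix


# Hypothetical four-word vocabulary, sorted so each token has a fixed index.
vocabulary = sorted({"we", "love", "deep", "learning"})
num_encoder_tokens = len(vocabulary)

sentence = ["we", "love", "learning"]
matrix = one_hot_matrix(sentence, vocabulary)

# The matrix has shape (timesteps, num_encoder_tokens).
print(len(matrix), len(matrix[0]))
```

A longer sentence would add rows, but the row width would stay at num_encoder_tokens.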

We can import these layers as well as the model we need like so:

from keras.layers import Input, LSTM
from keras.models import Model

Next, we set up the input layer, which needs to know the shape of the data we’re providing. In this case, we know that every token is a one-hot vector of length num_encoder_tokens, but we don’t necessarily know how many tokens each sentence will have. Fortunately, we can say None for that dimension, because the layers are written to handle varying sequence lengths. We also don’t need to specify the batch size (how many sentences we’re feeding the model at a time): Keras leaves it out of shape and adds it as a leading dimension for us.

# the shape specifies the input matrix sizes
encoder_inputs = Input(shape=(None, num_encoder_tokens))
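As a rough illustration of what None means in a shape, the sketch below uses plain Python rather than Keras. It relies on the documented Keras behavior that the shape argument of Input excludes the batch size, so the full tensor shape gains a leading None. The helpers full_input_shape and fits, and the vocabulary size of 18, are hypothetical and exist only for this example.

```python
def full_input_shape(shape):
    """Prepend the unspecified batch dimension to a per-sample shape."""
    return (None,) + tuple(shape)


def fits(batch_shape, spec):
    """True when a concrete shape satisfies a spec where None means 'any size'."""
    return len(batch_shape) == len(spec) and all(
        want is None or want == got for got, want in zip(batch_shape, spec)
    )


num_encoder_tokens = 18  # hypothetical vocabulary size
spec = full_input_shape((None, num_encoder_tokens))

# Batches of different sizes and sentence lengths all fit the same spec...
print(fits((64, 7, num_encoder_tokens), spec))
print(fits((1, 12, num_encoder_tokens), spec))
# ...but a different one-hot width does not.
print(fits((64, 7, 20), spec))
```

Only the last dimension is pinned down, which is why the same encoder can accept sentences of different lengths and batches of different sizes.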

For the LSTM layer, we need to choose the dimensionality (the size of the LSTM’s hidden states, which affects how closely the model can fit the training data and is something we can experiment with) and whether to return the states (in this case, we do):

encoder_lstm = LSTM(100, return_state=True)
# we're using a dimensionality of 100
# so any LSTM output matrix will have
# shape [batch_size, 100]
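One way to see why the dimensionality affects how closely the model can fit the data is to count trainable weights. The sketch below uses the standard parameter count for an LSTM layer with biases (four gates, each with input weights, recurrent weights, and a bias). The helper name lstm_param_count and the vocabulary size of 18 are assumptions made for illustration.

```python
def lstm_param_count(input_dim, units):
    """Trainable weights in a standard LSTM layer with biases.

    Each of the 4 gates has an input kernel (input_dim x units),
    a recurrent kernel (units x units), and a bias (units).
    """
    return 4 * (units * (input_dim + units) + units)


num_encoder_tokens = 18  # hypothetical vocabulary size

for units in (100, 256):
    print(units, lstm_param_count(num_encoder_tokens, units))
```

Because of the units x units recurrent term, the count grows roughly with the square of the dimensionality, so larger values give the model far more capacity to fit (or overfit) the training data.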

Remember, the only thing we want from the encoder is its final states. We can get these by linking our LSTM layer with our input layer:

encoder_outputs, state_hidden, state_cell = encoder_lstm(encoder_inputs)

encoder_outputs isn’t really important for us, so we can just discard it. The states, however, we’ll save in a list:

encoder_states = [state_hidden, state_cell]
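To show where the two final states come from, here is a toy LSTM forward pass in plain Python. It is a sketch of the standard LSTM equations with random, untrained weights, not the Keras implementation; the function name toy_lstm_final_states, the weight range, and the three-token input are all assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_lstm_final_states(sequence, units, seed=0):
    """Run a toy LSTM over one sequence and return (hidden, cell) after the last step."""
    rng = random.Random(seed)
    input_dim = len(sequence[0])

    def random_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    # Rows are grouped by gate: input, forget, candidate, output.
    w_input = random_matrix(4 * units, input_dim)
    w_recurrent = random_matrix(4 * units, units)
    bias = [0.0] * (4 * units)

    hidden = [0.0] * units
    cell = [0.0] * units
    for x in sequence:
        z = [
            sum(w * xi for w, xi in zip(w_input[k], x))
            + sum(u * hi for u, hi in zip(w_recurrent[k], hidden))
            + bias[k]
            for k in range(4 * units)
        ]
        input_gate = [sigmoid(v) for v in z[:units]]
        forget_gate = [sigmoid(v) for v in z[units:2 * units]]
        candidate = [math.tanh(v) for v in z[2 * units:3 * units]]
        output_gate = [sigmoid(v) for v in z[3 * units:]]
        # The cell state carries long-term memory; the hidden state is its gated view.
        cell = [f * c + i * g for f, c, i, g in zip(forget_gate, cell, input_gate, candidate)]
        hidden = [o * math.tanh(c) for o, c in zip(output_gate, cell)]
    return hidden, cell


# Three one-hot tokens over a hypothetical four-word vocabulary.
sequence = [
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
state_hidden, state_cell = toy_lstm_final_states(sequence, units=5)
encoder_states = [state_hidden, state_cell]
print(len(encoder_states), len(state_hidden), len(state_cell))
```

However long the input sentence is, each state has exactly as many values as the layer’s dimensionality, which is what lets the encoder pass a fixed-size summary along.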

There is a lot to take in here, but there’s no need to memorize any of this — you got this.💪



We’ve moved the code from the previous exercises into another file to give you some room (and to speed things up a bit for you).

The necessary modules are imported, so now it’s up to you to set up the encoder layers and retrieve the states. Ready?

First, define an input layer, encoder_inputs. Give it a shape with:

  • a first dimension of None, so the number of tokens per sentence is left open
  • a second dimension set to num_encoder_tokens

Build an LSTM layer called encoder_lstm with a dimensionality of 256 that returns its states as well as its output.


Call encoder_lstm on encoder_inputs to retrieve the following return values:

  • encoder_outputs
  • state_hidden
  • state_cell

Now, create a list of the two states and assign it to a new variable: encoder_states.
