Now our model is ready for testing! Yay! However, to generate some original output text, we need to redefine the seq2seq architecture in pieces. Wait, didn’t we just define and train a model?
Well, yes. But the model we used for training our network only works when we already know the target sequence. This time, we have no idea what the Spanish should be for the English we pass in! So we need a model that will decode step-by-step instead of using teacher forcing. To do this, we need a seq2seq network in individual pieces.
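To make the difference concrete, here is a minimal sketch of what the decoder gets to see in each setting. The tokens are hypothetical stand-ins rather than anything from our dataset, and plain Python lists take the place of the one-hot matrices we actually train on:

```python
# Hypothetical target sentence with start/end markers (illustration only).
target = ["<START>", "hola", "mundo", "<END>"]

# Teacher forcing (training): the decoder input at each step is the true
# previous token, so the whole input sequence is known in advance.
decoder_input = target[:-1]
decoder_target = target[1:]

# Inference: only the start token is known. Every later input has to be
# the model's own previous prediction, produced one step at a time.
inference_input = [target[0]]

print(decoder_input)
print(decoder_target)
print(inference_input)
```

During training the input and target are the same sentence offset by one position. At test time there is no sentence to offset, which is why the decoder has to run one step at a time.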
To start, we’ll build an encoder model with our encoder inputs and the placeholders for the encoder’s output states:
encoder_model = Model(encoder_inputs, encoder_states)
Next up, we need placeholders for the decoder’s input states, which we can build as input layers and store together. Why? We don’t know what we want to decode yet or what hidden state we’re going to end up with, so we need to do everything step-by-step. We need to pass the encoder’s final hidden state to the decoder, sample a token, and get the updated hidden state back. Then we’ll be able to (manually) pass the updated hidden state back into the network:
latent_dim = 256
decoder_state_input_hidden = Input(shape=(latent_dim,))
decoder_state_input_cell = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_hidden, decoder_state_input_cell]
Using the decoder LSTM and decoder dense layer (with the activation function) that we trained earlier, we’ll create new decoder states and outputs:
decoder_outputs, state_hidden, state_cell = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
# Saving the new LSTM output states:
decoder_states = [state_hidden, state_cell]
# Below, we redefine the decoder output
# by passing it through the dense layer:
decoder_outputs = decoder_dense(decoder_outputs)
Finally, we can set up the decoder model. This is where we bring together:
- the decoder inputs (the decoder input layer)
- the decoder input states (the final states from the encoder)
- the decoder outputs (the NumPy matrix we get from the final output layer of the decoder)
- the decoder output states (the memory throughout the network from one word to the next)

decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)
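To see how these four pieces fit together at test time, here is a toy sketch of the step-by-step loop. It assumes nothing about Keras: `toy_decoder_step` is a hypothetical stand-in for one call to the decoder model, and its lookup table plays the role of the trained network. Only the shape of the loop (feed a token and a state in, get a token and a new state out) reflects what we described above.

```python
def toy_decoder_step(token, state):
    """Hypothetical stand-in for one decoder call: returns (next_token, new_state)."""
    # A fixed lookup table plays the role of the trained network.
    transitions = {"<START>": "hola", "hola": "mundo", "mundo": "<END>"}
    next_token = transitions.get(token, "<END>")
    # The "memory" carried from one word to the next.
    new_state = state + [token]
    return next_token, new_state


def greedy_decode(encoder_state, max_len=10):
    """Decode one token at a time, feeding each output back in as the next input."""
    token, state = "<START>", list(encoder_state)
    output = []
    for _ in range(max_len):
        token, state = toy_decoder_step(token, state)
        if token == "<END>":
            break
        output.append(token)
    return output


decoded = greedy_decode(encoder_state=["hello world"])
print(decoded)
```

The `max_len` cap is a design choice worth keeping in a real loop too: a decoder that never predicts the end token would otherwise run forever.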
As you may have noticed, we moved everything to another file again. We also saved the training model on its own as an HDF5 file, which we are loading back up in script.py.
We already created the encoder test model and built two decoder state input layers.
Before we get into the decoder side, put those two state input layers into a list called decoder_states_inputs: first the hidden state, then the cell state.
Now that we have the decoder state inputs organized, we can pass them, along with the decoder inputs, through the decoder LSTM layer to get testing decoder outputs and states.
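Call decoder_lstm() with the following arguments:
- decoder_inputs
- initial_state=decoder_states_inputs
Save the resulting return values to decoder_outputs, state_hidden, and state_cell.
Then save these two new states into a new list, decoder_states.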
Time to redefine the decoder_outputs you got from the decoder LSTM. Pass this output through the decoder_dense() layer and save the resulting value back to decoder_outputs.
And finally we can create the decoder test model!
Build a Keras model called decoder_model with the following arguments passed in:
- [decoder_inputs] + decoder_states_inputs
- [decoder_outputs] + decoder_states
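If the `[decoder_inputs] + decoder_states_inputs` syntax looks odd, it is ordinary Python list concatenation. The sketch below uses placeholder strings instead of real Keras layers (an assumption made so it runs anywhere) to show the flat, three-item list the model ends up receiving:

```python
# Strings stand in for the Keras tensors so the example runs without Keras.
decoder_inputs = "decoder_inputs"
decoder_states_inputs = ["state_input_hidden", "state_input_cell"]

# Wrapping the single input in a list lets + join it with the state list,
# giving one flat list: the token input first, then the two states.
model_inputs = [decoder_inputs] + decoder_states_inputs
print(model_inputs)
```

The order matters: whatever order the inputs have here is the order we have to use later when we pass data to the decoder model.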