Generative Chatbots

A generative chatbot is an open-domain chatbot program that generates original combinations of language rather than selecting from pre-defined responses. The seq2seq models used for machine translation can also be used to build generative chatbots.

Dataset choice with generative chatbots

Choosing the right dataset is a recurring challenge when creating generative chatbots. Common concerns include:

  • Data source (e.g., Twitter, Slack, customer service conversations, etc.)
  • Authentic dialog vs. fictional dialog
  • License of dataset
  • Biases, bigotry, or rudeness within the dataset

Generative chatbot input format

For generative chatbots using a Keras-based seq2seq model, user input must be converted into NumPy matrices of one-hot vectors: one row per token, with a 1 at the index of that token in the vocabulary.

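import re
import numpy as np

# Converts user input into a NumPy matrix of one-hot vectors.
# max_encoder_seq_length, num_encoder_tokens, and input_features_dict
# are assumed to be defined during training setup.
def string_to_matrix(user_input):
    # split into words (keeping apostrophes) and punctuation marks
    tokens = re.findall(r"[\w']+|[^\s\w]", user_input)
    user_input_matrix = np.zeros(
        (1, max_encoder_seq_length, num_encoder_tokens),
        dtype='float32')
    # truncate so long inputs cannot index past the matrix
    for timestep, token in enumerate(tokens[:max_encoder_seq_length]):
        # tokens missing from the vocabulary leave an all-zero row
        if token in input_features_dict:
            user_input_matrix[0, timestep, input_features_dict[token]] = 1.
    return user_input_matrix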
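
To make the conversion concrete, here is a minimal sketch that traces one sentence through the same tokenizing pattern and marks which one-hot positions get set. It uses plain nested lists instead of NumPy so it runs with the standard library alone. The toy vocabulary, the sequence length of 8, and the helper names tokenize and one_hot_rows are assumptions made for illustration, not part of any trained model.

```python
import re

# Hypothetical toy vocabulary; a real one is built from the training data.
input_features_dict = {"Hello": 0, ",": 1, "how's": 2, "it": 3, "going": 4, "?": 5}
max_encoder_seq_length = 8  # assumed value for this example
num_encoder_tokens = len(input_features_dict)

def tokenize(user_input):
    # Runs of word characters/apostrophes, or single punctuation marks.
    return re.findall(r"[\w']+|[^\s\w]", user_input)

def one_hot_rows(user_input):
    """Return max_encoder_seq_length rows, each a one-hot (or all-zero) list."""
    rows = [[0.0] * num_encoder_tokens for _ in range(max_encoder_seq_length)]
    tokens = tokenize(user_input)[:max_encoder_seq_length]
    for timestep, token in enumerate(tokens):
        if token in input_features_dict:
            rows[timestep][input_features_dict[token]] = 1.0
    return rows

tokens = tokenize("Hello, how's it going?")
rows = one_hot_rows("Hello, how's it going?")
print(tokens)   # ['Hello', ',', "how's", 'it', 'going', '?']
print(rows[2])  # [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```

The six tokens fill the first six rows. The remaining two rows stay all zeros, which acts as padding up to the fixed sequence length.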

Generative chatbot unknown word handling

There are several ways to handle unknown words in generative chatbots, including ignoring the unknown words, asking the user to rephrase, or replacing them with <UNK> tokens.
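
The three strategies can be sketched side by side as follows. This is an illustration under stated assumptions: the toy vocabulary, the helper names, and the 0.5 threshold for asking the user to rephrase are all hypothetical. Note that replacing words with <UNK> only helps if the model was trained on data that also contains the <UNK> token.

```python
import re

UNK = "<UNK>"
# Hypothetical toy vocabulary; in practice it comes from the training data.
VOCAB = {"what", "is", "the", "weather", "today", "?", UNK}

def tokenize(text):
    # Lowercasing is an assumption made to match the lowercase toy vocabulary.
    return re.findall(r"[\w']+|[^\s\w]", text.lower())

def drop_unknown(tokens, vocab=VOCAB):
    """Strategy 1: ignore unknown words."""
    return [t for t in tokens if t in vocab]

def should_ask_to_rephrase(tokens, vocab=VOCAB, max_unknown_ratio=0.5):
    """Strategy 2: ask for a rephrase when too much of the input is unknown."""
    if not tokens:
        return True
    unknown = sum(1 for t in tokens if t not in vocab)
    return unknown / len(tokens) > max_unknown_ratio

def mark_unknown(tokens, vocab=VOCAB):
    """Strategy 3: replace each unknown word with the <UNK> token."""
    return [t if t in vocab else UNK for t in tokens]

tokens = tokenize("What is the weather in Reykjavik today?")
print(drop_unknown(tokens))
# ['what', 'is', 'the', 'weather', 'today', '?']
print(mark_unknown(tokens))
# ['what', 'is', 'the', 'weather', '<UNK>', '<UNK>', 'today', '?']
print(should_ask_to_rephrase(tokens))                    # False
print(should_ask_to_rephrase(tokenize("Blorp zzyzx?")))  # True
```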

Handling context for generative chatbots

How best to handle chat context and information from previous turns of dialog is still an open question in generative chatbot research. Some proposed solutions include:

  • training the model to hang onto some previous number of dialog turns
  • keeping track of the decoder’s hidden state across dialog turns
  • personalizing models by including user context during training or by supplying it alongside the user input
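
As a rough illustration of the first idea, the sketch below keeps a fixed number of previous turns and prepends them to the new user input before it is encoded. The DialogHistory class and the <SEP> separator token are hypothetical names chosen for this example. A model would need to be trained on inputs in the same joined format, with a correspondingly larger maximum sequence length, for this to be useful.

```python
from collections import deque

class DialogHistory:
    """Keeps only the most recent max_turns dialog turns."""

    def __init__(self, max_turns=3, separator=" <SEP> "):
        # A deque with maxlen silently discards the oldest turn when full.
        self.turns = deque(maxlen=max_turns)
        self.separator = separator

    def add_turn(self, text):
        self.turns.append(text)

    def encoder_input(self, user_input):
        # Join remembered turns and the new input into one string.
        return self.separator.join([*self.turns, user_input])

history = DialogHistory(max_turns=2)
history.add_turn("Hi there")
history.add_turn("Hello! How can I help?")
history.add_turn("What's the weather?")  # "Hi there" is dropped here

context_input = history.encoder_input("And tomorrow?")
print(context_input)
# Hello! How can I help? <SEP> What's the weather? <SEP> And tomorrow?
```

The joined string can then be tokenized and converted to one-hot vectors like any other user input.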
