
How ChatGPT Works

Generative AI

Generative AI refers to algorithmic systems that are trained on large datasets in order to generate new data in the form of text, images, video, or audio.

Generative AI: Training Data

Generative AI models need to learn patterns from existing text, image, video, or audio data in order to generate new data of the same kind. The data they are trained on is referred to as training data.

An image showing that text, images, and websites are examples of training data.

Encoding & Decoding

The process of converting non-numerical data, such as text or images, into lists of numbers that a machine can read is referred to as encoding. The reverse process of converting machine output back into a human-readable form is referred to as decoding.
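The round trip from text to numbers and back can be illustrated with a minimal sketch. It assumes a simple character-level scheme in which each distinct character gets an integer ID. The function names `build_vocab`, `encode`, and `decode` are hypothetical and chosen only for illustration. Production systems such as the GPT models use subword tokenizers instead, so this is a simplification rather than a description of how ChatGPT encodes text.

```python
def build_vocab(text):
    """Assign an integer ID to each distinct character (illustrative scheme)."""
    chars = sorted(set(text))
    char_to_id = {ch: i for i, ch in enumerate(chars)}
    id_to_char = {i: ch for ch, i in char_to_id.items()}
    return char_to_id, id_to_char


def encode(text, char_to_id):
    """Encoding: turn human-readable text into a list of numbers."""
    return [char_to_id[ch] for ch in text]


def decode(ids, id_to_char):
    """Decoding: turn a list of numbers back into human-readable text."""
    return "".join(id_to_char[i] for i in ids)


# Build the vocabulary from a small sample, then round-trip a word.
char_to_id, id_to_char = build_vocab("hello world")
ids = encode("hello", char_to_id)
print(ids)                      # [3, 2, 4, 4, 5]
print(decode(ids, id_to_char))  # hello
```

Decoding the encoded list returns the original text, which is the property any encoding scheme needs for machine output to be turned back into something readable.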

Generative AI: Training

During the training phase of a generative AI model, the encoded training data is used to estimate probability distributions that capture the underlying structure of the data.

An image showing that training data is used to create a probability distribution.
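As a rough illustration of turning training data into a probability distribution, the sketch below counts which word follows which in a tiny corpus and normalizes those counts into probabilities. This is a simple bigram count model, assumed here purely for illustration. The GPT models learn their distributions with neural networks rather than count tables, and the name `train_bigram_model` and the sample corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Estimate P(next token | current token) from adjacent token pairs."""
    counts = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        counts[current][nxt] += 1

    # Normalize each row of counts so its probabilities sum to 1.
    model = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        model[current] = {tok: n / total for tok, n in followers.items()}
    return model


# Hypothetical toy corpus standing in for real training data.
corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram_model(corpus)

# Distribution over the words that follow "the" in the corpus.
print(model["the"])
```

In this corpus, "the" is followed by "cat" twice and by "mat" once, so the model assigns "cat" a probability of 2/3 and "mat" a probability of 1/3. Sampling from such a distribution, one token at a time, is the basic idea behind generating new text.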

Generative AI: Learning & Feedback

Generative AI models can iteratively improve at the tasks given to them through reinforcement learning, human feedback, and unsupervised learning techniques.

An image showing that extra learning and filtering occur after training.

AI “Hallucinations”

The GPT models can sometimes generate wrong or irrelevant text in response to a prompt. This happens for a variety of reasons, including but not limited to ambiguous input, lack of specific knowledge, and inherent model limitations. Such responses are often referred to as “hallucinations”.
