Everyone’s been talking about artificial intelligence (AI) as generative AI systems have become more mainstream over the past year. But the technology underlying these impressive programs isn’t new. The use of language models, computer programs that learn to understand and generate human language by analyzing large amounts of text, goes back decades.
Today, language models are used to build generative AI that can perform tons of different tasks. Here are a few fun facts that’ll give you a sense of their potential — but if you want to learn more about all the cool things you can do with AI, check out our AI courses.
One of the first language models mimicked a psychotherapist
In 1966, MIT computer scientist Joseph Weizenbaum developed ELIZA, a program that simulated conversation by matching patterns in what users typed and replying with scripted templates. ELIZA was groundbreaking at the time and could play a number of roles (including a psychotherapist) to engage users in conversation about their problems.
Despite being simple by today’s standards, ELIZA was surprisingly effective, to the point where people formed emotional bonds with the chatbot. It was a breakthrough in natural language processing and helped pave the way for many of the impressive programs we see today.
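To get a feel for how little machinery this style of chatbot needs, here is a minimal sketch of an ELIZA-like responder. The rules, the reflection table, and the function names are all made up for illustration; they are not Weizenbaum's original script, just an assumption about what a tiny pattern-matching bot could look like.

```python
import re

# Hypothetical rules for illustration; these are not ELIZA's original script.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Swap first-person words for second-person ones so echoed text reads naturally.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

DEFAULT_REPLY = "Please go on."


def reflect(fragment: str) -> str:
    """Rewrite a captured fragment from the user's point of view to the bot's."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(message: str) -> str:
    """Return the reply for the first rule that matches, or a stock fallback."""
    cleaned = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY


print(respond("I need a break from my job."))  # Why do you need a break from your job?
print(respond("The weather is nice."))         # Please go on.
```

The bot never understands anything: it only echoes parts of your message back inside a template, which is a big part of why ELIZA's effect on people was so surprising.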
Want to learn how to build your own chatbots? We’ll teach you how in our skill path Build Chatbots with Python.
Language models predate the Turing Test
Alan Turing, known to many as the father of AI, created his eponymous test in 1950 to answer one question: Can machines think?
But language models go back to before AI was even conceptualized. At first, they were basic statistical models that used probabilities to predict the likelihood of a given word based on the words that came before it. The earliest example is the n-gram model, which mathematician Claude Shannon referenced in 1948. An n-gram is a sequence of n consecutive words, so a bigram (n = 2) model predicts each word from the single word before it. For a more current example, check out Google’s Ngram Viewer, which shows you how often given words have been used in books throughout the years.
These early language models were simple and relied on limited data, but they were an important step in the development of natural language processing.
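Here is a minimal sketch of the idea using a bigram model, the simplest useful n-gram. The corpus is a made-up toy sentence and `next_word_probability` is a hypothetical helper name; a real model would be trained on far more text and would usually smooth its counts.

```python
from collections import Counter, defaultdict

# Toy corpus made up for this example.
corpus = "the cat sat on the mat . the cat ate the fish ."
tokens = corpus.split()

# For each word, count which words follow it.
bigram_counts = defaultdict(Counter)
for previous, current in zip(tokens, tokens[1:]):
    bigram_counts[previous][current] += 1


def next_word_probability(previous: str, current: str) -> float:
    """Estimate P(current | previous) as count(previous, current) / count(previous, anything)."""
    followers = bigram_counts.get(previous)
    if not followers:
        return 0.0  # we never saw `previous`, so we have no estimate
    return followers[current] / sum(followers.values())


# "the" is followed by cat, mat, cat, fish: 2 of those 4 are "cat".
print(next_word_probability("the", "cat"))  # 0.5
```

Everything the model "knows" is in those counts, which is why early language models were so limited by the amount of text they had seen.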
Language models can be trained on multiple languages
There are thousands of different languages across the world, and multilingual language models are being developed to help make AI as linguistically diverse as the global populace. Multilingual language models are very important — not only because diversity in training data helps avoid bias, but also because it’s crucial that everyone is afforded the same access to AI tools and resources.
Today, Google and other search engines use AI to improve their translators, and multilingual language models are becoming increasingly popular as they help bridge language barriers and improve communication across different cultures.
Want to learn more about how computers are taught how to interpret the complexities of human language? You can explore chatbots and other applications of language models in our course Apply Natural Language Processing with Python.
Language models can recognize human emotions
Some language models are taught through a mix of supervised and unsupervised learning algorithms to recognize the emotional subtext and undertones within text, in a process called sentiment analysis. While most of the models used for sentiment analysis can only discern between positive, negative, and neutral tones, others can recognize specific emotions like joy or sadness.
Tons of brands and businesses use sentiment analysis to get a sense of how their customers are talking about them online. This can help them offer better support, understand how customers are responding to changes or new features, and even keep an eye on competitors. You can also use sentiment analysis for something fun — like analyzing your favorite book or song lyrics for hidden themes (you can try this in our case study Analyze Taylor Swift Lyrics with Python).
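To make the positive, negative, and neutral labels concrete, here is a deliberately simple sketch. Note that it swaps in a different technique from the trained models described above: it just counts words from two tiny hand-made lists (an assumption for illustration, not a real sentiment lexicon), so it will miss sarcasm, negation, and context.

```python
import re

# Tiny hand-made word lists for illustration only.
POSITIVE_WORDS = {"love", "great", "happy", "excellent", "fun"}
NEGATIVE_WORDS = {"hate", "terrible", "sad", "awful", "slow"}


def classify_sentiment(text: str) -> str:
    """Label text by comparing how many positive and negative words it contains."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify_sentiment("I love the new update, it is great!"))           # positive
print(classify_sentiment("The app is slow and the support was terrible."))  # negative
print(classify_sentiment("The package arrived on Tuesday."))                # neutral
```

A trained model learns these associations from labeled examples instead of a fixed list, which is what lets it pick up on subtler signals.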
Language models can be used for more than text
While language models are often trained on text data, their underlying technology can be used for other data as well. Neural network architectures such as transformers, recurrent neural networks (RNNs), and generative adversarial networks (GANs) can be trained on audio and image data too. That means the same techniques can be used to recognize and generate speech or understand and describe images.
For example, AI tools like DALL-E and Midjourney can generate images based on your prompts. There are also tools like MiniGPT-4 and Microsoft Azure AI Vision that can analyze pictures and provide detailed descriptions and captions, along with various AI programs that can replicate popular vocalists and musicians.
These language models can also make it easier to learn to speak another language. By analyzing patterns in language data, language models can identify areas where people may struggle and provide tailored support. The language learning app Duolingo recently announced new features that allow people learning English, French, and Spanish to practice their skills by chatting with an AI in real time.
Different models have their own ways of learning
There’s a lot of jargon around language models that you’ll come across as you explore AI. Understanding what these terms mean and how different models work under the hood will help you make more informed decisions when working with AI systems and tools. Some of the most common types include:
- Foundational models: Pre-trained models used to build larger, more advanced language models.
- Generative language models: Language models that can generate text.
- Statistical language models: Language models that use probability and statistics to predict the likelihood of words based on the ones that came before them.
- Rule-based language models: Language models that generate output based on given rules and guidelines.
- Neural language models: Language models that use deep learning algorithms and neural networks to understand and generate natural language.
- Large language models: Neural language models, like the ones behind ChatGPT, that use deep learning algorithms and massive amounts of data to perform tasks like translating and summarizing text and even creative writing.
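To see how the "statistical" and "generative" labels can apply to the same model, here is a small sketch that generates text by repeatedly sampling the next word from bigram statistics. The corpus, the `generate` function, and the seed parameter are assumptions made up for this example; large language models do something conceptually similar with a neural network in place of the lookup table.

```python
import random
from collections import defaultdict

# Toy corpus made up for this example.
corpus = "the cat sat on the mat . the dog sat on the rug ."
tokens = corpus.split()

# Record every word that follows each word; repeats keep the statistics.
followers = defaultdict(list)
for previous, current in zip(tokens, tokens[1:]):
    followers[previous].append(current)


def generate(start: str, max_words: int, seed: int = 0) -> list[str]:
    """Build a word sequence by sampling each next word from what followed the last one."""
    rng = random.Random(seed)  # seeded so repeated runs give the same text
    words = [start]
    while len(words) < max_words:
        options = followers.get(words[-1])
        if not options:
            break  # the last word never appeared with a successor
        words.append(rng.choice(options))
    return words


print(" ".join(generate("the", 8)))
```

Change the seed to get different sentences. Every pair of neighboring words will have appeared in the corpus, but the sentence as a whole may be brand new.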
Language models can make mistakes
A common misconception is that AI systems are immune to errors and bias, but language models and AI systems can and do make mistakes, especially when they’re trained with incomplete or faulty data. Language models generate content based on what they’re taught, and any biases within their training data risk being amplified in their output. For instance, MIT scientists found that a language model reflected gender-based stereotypes, attaching feminine contexts to flight attendants and secretaries and masculine contexts to lawyers and judges. It also takes a lot of work to keep language models up to date.
Today’s language models can be used to write everything from books to code for apps and websites. With systems like ChatGPT and GitHub Copilot, there’s a lot of concern about AI taking jobs — but once you start working with AI tools, it’s clear robots won’t be replacing us any time soon.
While language models are great at generating huge amounts of well-structured text and code in no time, they have a hard time with context and replicating human ingenuity. Here’s an example of human vs. AI code that breaks it down in more detail.
Instead of replacing us, AI systems can be great partners that help improve our productivity and efficiency. Here’s an explainer on how software engineers are using AI for more info.
As language models and AI become more popular and continue to find new applications, it becomes increasingly important to understand not only when to use them, but also how to use them properly. If you want to learn more about language models and how AI is being put to good use, check out our courses on machine learning and AI. We’ll show you how to start building and working with language models right away in courses like Language Models in Python: Generative Text.