Supervised & Unsupervised Learning
Feb 01, 2019

Machine learning is the field of computer science that gives computer systems the ability to learn from data, and it's one of the hottest topics in the industry right now.
In this video, we will explore the different types of supervised learning techniques, such as regression and classification, and unsupervised learning methods, such as clustering.
We will also take a look at the concepts of supervised and unsupervised learning — and break down the differences between them.
We will now take a closer look at the concepts of supervised and unsupervised learning, and understand exactly what the differences are between these two techniques. To begin, let us take a look at an example which we have studied so far: a linear regression line, which plots the relationship between a particular cause, x, and an effect, y.
[Video description begins] A graph appears that plots Cause on the X-axis and Effect on the Y-axis. A downward slanting straight line is drawn among the cluster of plotted dots. Under this graph is the text, Find the line that best fits the underlying data. [Video description ends]
So the point of linear regression is to find a best-fit straight line which models such a relationship, and this is expressed by the equation y = Wx + b. In a linear regression, the aim is to find the values of W and b which best represent your data.
[Video description begins] The following instruction appears on the screen: Find the values of the weights W and bias b. [Video description ends]
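To make this concrete, here is a minimal sketch of finding W and b with scikit-learn's LinearRegression; the cause/effect data below is invented purely for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression

# Invented cause/effect data: y roughly follows y = -2x + 10 plus some noise
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 10, size=(100, 1))             # input feature (the cause)
y = -2 * x.ravel() + 10 + rng.normal(0, 1, 100)   # output value (the effect)

model = LinearRegression()
model.fit(x, y)                                   # learn W and b from the data

print("W (weight):", model.coef_[0])              # slope of the best-fit line
print("b (bias):", model.intercept_)              # intercept of the best-fit line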
This is where the learning part of machine learning comes into the picture. Whether your model performs regression or classification, you will first feed it a corpus of data. This represents your training data, and your model will go through each of the data points within it and make a prediction. Following that, a loss will be computed.
[Video description begins] The graphical representation shows a Corpus of data passing through a classifier, resulting in a classification. The loss is shown being sent back into the classifier. [Video description ends]
This is a measure of how far away your predicted value is from the actual one, and the aim of any training is to minimize this loss value. At the end of this training phase, you'll end up with your trained model, which in the case of a linear regression could be a straight line or a plane, and in the case of a classification model which uses support vector machines, a hyperplane. In either of these cases, the loss is computed by comparing the predicted value with the actual y value.
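To make the predict, compute-loss, adjust cycle concrete, here is a minimal gradient-descent loop for the same y = Wx + b model written in plain NumPy; the learning rate, epoch count, and toy data are arbitrary choices for illustration.

import numpy as np

# Toy training corpus: the "actual" y values follow y = -2x + 10 plus noise
rng = np.random.default_rng(seed=1)
x = rng.uniform(0, 10, size=100)
y = -2 * x + 10 + rng.normal(0, 1, 100)

W, b = 0.0, 0.0          # start from arbitrary parameter values
learning_rate = 0.01

for epoch in range(5000):
    y_pred = W * x + b                    # predict for every point in the corpus
    loss = np.mean((y_pred - y) ** 2)     # mean squared error: how far off we are

    # gradients of the loss with respect to W and b
    grad_W = 2 * np.mean((y_pred - y) * x)
    grad_b = 2 * np.mean(y_pred - y)

    # nudge the parameters in the direction that reduces the loss
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print("learned W:", W, "learned b:", b, "final loss:", loss)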
Now, what if you do not actually have any y values to work with? For instance, consider that you have a corpus of data which includes millions of emails, but there is no categorization available as to whether those emails are spam or ham. The machine learning techniques which we have studied so far, including classification and regression, are examples of supervised learning. This is where our data set includes not only the input features x, but also labels y which correspond to those input features. A machine learning algorithm will go through all of the input data points and then learn a relationship y = f(x).
[Video description begins] The following formula appears on the screen: y = some_function(x). [Video description ends]
This is in fact similar to performing some form of reverse engineering in order to learn the relationship between the input x and the output y. It is the input training dataset, or corpus, which will be iterated over, often multiple times, in order to learn the weights and biases which represent this relationship. In the case of a linear regression, the goal of the training phase is to find the relationship y = Wx + b, where W represents the weight and b is the bias. So in short, supervised learning is where you have a set of input features and a collection of correctly labeled outputs corresponding to them.
This data will be used in the training phase in order to learn a relationship, represented by y = f(x). This relationship can then be used to make predictions on real data later on. Both regression and classification are examples of supervised learning. Given that, we now move along to the concept of unsupervised learning, where we have a collection of data, but we need to examine that data and generate an output without having any correct labels or identifiers to work with.
So to summarize, unsupervised learning is where your model has to learn patterns by merely looking at the data. Once those patterns have been identified, you can then decide what course of action to take based on those results. To contrast supervised and unsupervised learning: in the case of supervised learning, such as regression and classification, you will have training data to work with. This will include not only the features x, but also a corresponding label y for each data point.
It is the job of the supervised learning algorithm to learn this relationship between x and y. It will do this by constantly adjusting the model parameters in the training phase until the loss is minimized. Moving along to unsupervised learning, this is a situation where you do not have any training labels to work with. The task of the unsupervised learning algorithm is to structure the model in such a way that some patterns can be gleaned from the underlying data. Once such patterns have been obtained, it is up to the user to decide how exactly to act upon them.
For example, if you have information about a lot of users on social media and you apply an unsupervised learning algorithm to them, it may break up all of the users into groups, where some groups represent music lovers and other groups represent sports lovers. An advertiser may then wish to target the music lovers with music-themed advertisements and the sports fans with sports-themed advertisements.
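Here is a minimal sketch of that kind of grouping using k-means clustering from scikit-learn; the "hours per week on music vs. sports content" features and the choice of two clusters are assumptions made up for illustration.

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical users, each described by two features:
# [hours/week on music content, hours/week on sports content]
users = np.array([
    [9.0, 0.5], [8.5, 1.0], [7.8, 0.2],   # mostly music
    [0.4, 8.9], [1.1, 7.5], [0.3, 9.2],   # mostly sports
])

# ask k-means to split the users into two groups; no labels are provided
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
group = kmeans.fit_predict(users)

print(group)   # e.g. [0 0 0 1 1 1] -- which cluster each user fell into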
So let us now summarize some of the characteristics of unsupervised learning. This is where you'll often end up with a large unlabeled dataset. That is, there are no y values to speak of. The job of the unsupervised learning algorithm is to structure a model in such a way that patterns can be gathered from the underlying data. And an important point to keep in mind is that these patterns are self-discovered by the model. There are no specific patterns which are being searched for.
And there are no correct or incorrect patterns which can be gleaned. An example of a supervised learning model is a classifier which can label an email that enters your inbox as spam or ham, and then take an action accordingly: if the email is marked as spam, it will be sent to your trash folder, and if not, it will appear in your inbox. In the case of an unsupervised learning algorithm, however, the input data is not labeled in any way.
The output of unsupervised learning is a collection of self-discovered patterns within the data. There are no correct or incorrect answers produced by an unsupervised learning algorithm, just a collection of patterns which the user can use in order to make a further decision. So given that there are no correct or incorrect answers when it comes to applying unsupervised learning, how exactly can this help?
Well, as we have already touched upon, unsupervised learning algorithms can help you find logical groupings in your data. For example, with social media users, there may be groups of music lovers and sports lovers. And this information can help you direct some customized advertising campaigns towards each of these groups. In addition, unsupervised learning can also help you extract the significant factors in your data.
We will cover an example of that in just a little bit. Moving along to the use cases for unsupervised learning: we already saw that logical groups within the data can help you identify users who are music lovers or sports lovers, and you're likely to find patterns which you never even imagined in the first place. For example, you may find a group of teenagers who happen to be very interested in music from the 1950s.
Another common application of unsupervised learning is in the field of latent factor analysis. This will help you extract the significant factors in your underlying data. For example, if your dataset includes a number of different products, their sales quantities, and a number of different factors which affect them, then you may be able to identify the common factors which seem to drive the sales of several products.
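One common way to extract such factors is factor analysis, for example with scikit-learn's FactorAnalysis; the sales matrix and the choice of two latent factors below are invented purely to show the shape of the call.

import numpy as np
from sklearn.decomposition import FactorAnalysis

# Invented data: weekly sales quantities for 5 products across 8 weeks
rng = np.random.default_rng(seed=2)
sales = rng.poisson(lam=20, size=(8, 5)).astype(float)

# Try to explain the observed sales with 2 underlying (latent) factors,
# such as "seasonal demand" or "promotional activity"
fa = FactorAnalysis(n_components=2, random_state=0)
factors = fa.fit_transform(sales)

print(factors.shape)         # (8, 2): each week described by 2 hidden factors
print(fa.components_.shape)  # (2, 5): how strongly each factor loads on each product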
Another common application of unsupervised learning is to categorize data which is unlabeled. So for example, if you happen to have pictures from a number of vacations of yours, an unsupervised learning algorithm may be able to identify the same flower from different photos across your vacations. You could then question whether there is some kind of subconscious force which is drawing you towards those flowers.
One more area where unsupervised learning finds significant use is in the field of fraud detection. Given a number of different transactions, often numbering in the millions, an unsupervised learning algorithm may be able to cast all of the unusual transactions into a single group. While these transactions will not be labeled as fraudulent, you may decide that they are worthy of further investigation.
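The video does not name a specific algorithm here, but Isolation Forest is one commonly used unsupervised technique for flagging unusual records; the transaction data below is fabricated for illustration.

import numpy as np
from sklearn.ensemble import IsolationForest

# Fabricated transactions: [amount, hour of day]
transactions = np.array([
    [25.0, 14], [12.5, 9], [40.0, 18], [18.0, 12], [30.0, 16],
    [9500.0, 3],   # unusually large, at an odd hour
])

# contamination is a guess at what fraction of transactions look unusual
detector = IsolationForest(contamination=0.2, random_state=0)
flags = detector.fit_predict(transactions)

# -1 marks the transactions the model considers unusual; they are not
# labeled "fraud", just candidates worth a closer look
print(flags)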
Unsupervised learning can in fact be a precursor to applying supervised learning techniques. For that, consider that you have some unlabeled data and you find some logical groups within it. You could then apply labels explicitly to each of these logical groups, train a classification model, and later use it to categorize new data into these groups. Some of the commonly used unsupervised learning techniques include clustering, where data is split into a number of clusters and the data in each cluster shares a number of common attributes. You can also use unsupervised learning for dimensionality reduction, in order to trim the number of dimensions in your data, and we have already discussed the concept of latent factor analysis.
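Here is a rough sketch of that pipeline, assuming synthetic data: reduce the dimensions with PCA, discover groups with k-means, and then train a classifier on the discovered group labels.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Synthetic unlabeled data: 200 points in 10 dimensions, drawn from two blobs
rng = np.random.default_rng(seed=3)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 10)),
    rng.normal(loc=5.0, scale=1.0, size=(100, 10)),
])

# Dimensionality reduction: keep 2 dimensions instead of 10
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Clustering: discover logical groups without any labels
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_reduced)

# Use the discovered groups as labels to train a supervised classifier,
# which can then assign brand-new data points to one of the groups
classifier = LogisticRegression().fit(X_reduced, clusters)

new_point = rng.normal(loc=5.0, scale=1.0, size=(1, 10))
print(classifier.predict(pca.transform(new_point)))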