How Machine Learning Works

Feb 01, 2019

Machine learning is the field of computer science that gives computer systems the ability to learn from data — and it’s one of the hottest topics in the industry right now.

In this clip, we will take a look at how exactly machine learning works. Specifically, how do machine learning models learn to find patterns and data in order to make accurate predictions? So you may be aware that machine learning models are able to make decisions based on the data which is supplied to them. What we need to uncover, though, is how exactly a machine learning model is able to learn from the data and then make predictions on new data later on.

To understand exactly how this happens, let us consider an example which we had mentioned a little earlier on, where we need to classify whether a particular email which is entering an inbox happens to be spam or ham. As part of the machine learning, there are two phases which any supervised learning model will go through.

[Video description begins] Screen title: Classification Using Machine Learning. [Video description ends]

The first phase is known as training. And this is where the model will be fed in with data which has already been classified. For example, this can include a collection of emails, which have already been marked as spam or ham. During the training phase, the model will learn of patterns within the data, which will help it decide later on, how to classify emails.

And once the training phase is complete, the model is now ready to make predictions on new data. That is, when it is fed in a new email which has not already been classified, it will look for the patterns it has learned during the training phase. And then decide whether that email represents spam, or is a legitimate email. We now zoom in on the training phase in machine learning. This is where a machine learning model such as a classifier will be fed

[Video description begins] Screen title: Training the Classifier. [Video description ends]

in a collection of data, which is referred to as training data or a corpus. In this case, the training data will already be labeled. For example, it'll have a spam or ham classification already. You can think of these as a set of questions and their correct answers which have been fed to the classifier. The classifier will first try to make a prediction without looking at the correct label and then see how far its prediction is from the actual result. This is known as the loss of the classifier and then this will be fed back into the classifier and on the next try, it will try to minimize the value of the loss.

As it goes through more and more data points, the loss will hopefully decrease continuously as the classifier learns of patterns in the input data. And once the training process is complete, what we end up with is a fully-trained machine learning model. At this point, when an input comes in, whether this is an email or some other form of input, it has come in without any classification label. That is, these are the features of the input or the X variables.

At this point, the classifier, having learned from the data which it has seen prior to this, will make a prediction. That is, it'll generate a label for the input data, for example, whether the email is spam or ham. These predictions, or labels, are also known as the Y values of a classifier. Once a classifier has made a prediction, it is up to the developer to decide what course of action to take following that. For example, if an email has been classified as ham, it will make it's way to your inbox and if it has been marked as spam it will be send to the trash directory.

So when any machine learning model, if fed an input data from which to make predictions, the input are referred to as features or X values. For example, in the case of spam detection, the features could be the words in the email. And another feature could be the sender field. The model may have learned, for instance, that the use of specific words or the use of a specific string in the sender's address may mean that this email is spam.

Other features in an email could also be the punctuation which is used. For example, excessive use of exclamation points may indicate that the email is not genuine. All of these are examples of what a machine learning classifier may consider as the input features. All of the input features which are grouped together are referred to as a feature vector. And these can also be called X variables. The job of a machine learning model is to take in the feature vector or the X variables and to produce a Y value or a label. That is, it will make a prediction based on the input features.

When a classification model generates a label for some input, it is effectively casting it into a particular category. So, for example, an email classifier works with two different categories, spam or ham. So the value which the model needs to predict is categorical in nature. This is in contrast to regression models where the value which needs to be predicted is continuous in nature.

For example, a regression model may be used to predict the price of a house or a stock whose value will fall into a certain range rather than categories. The output of a machine learning model, whether it is a categorical value, which is the case with classification models, or a continuous value, in the case of regression models, are referred to as Y variables. Labels, on the other hand, are also Y variables, but they're specific to classification models.

So just to summarize classification models, the data which they're fed in happen to be the features or the attributes of the data. During the training phase of the classifier, it will have learned to look for patterns in the data attributes. And once it has found patterns in the input, it will perform a categorization or classification. This categorization or classification will be in the forms of a label which it has assigned to the input data, and this is also known as the Y variable.

Depending on the type of problem being solved, the input data can be categorized into multiple classes. If there are just two options to choose from, for example, spam and ham, then the problem being solved is one of binary classifications, since the output will take a binary form.

[Video description begins] Screen title: N-Category Classification. [Video description ends]

On the other hand, if the output can fall into multiple categories. Take, for instance, where you are trying to predict the type of animal based on certain input data. So if your categories include dog, crab, and rat, then you have three categories to choose from. And if you have even more animals, then you potentially have n different categories. In such a case, the problem you are trying to solve is one of n category classification.