How Does Artificial Intelligence (AI) Work?
In this video, we'll explain how artificial intelligence works and identify the types of data that are used with AI. AI works by combining large amounts of data with fast, iterative processing and intelligent algorithms, so that the software learns from patterns or features in the data. Because AI only learns from the data it is given, it's important to use valid data and to account for and correct any biases. Data can be classified into four major groups: structured, semi-structured, quasi-structured, and unstructured. Structured data is organized in tables, like a Microsoft Excel spreadsheet, and is easy to manipulate. Semi-structured data is labeled but not in table format, allowing for more versatility. Quasi-structured data has some patterns but lacks clear labels or structure, so it requires more formatting. Unstructured data, like videos, podcasts, and pictures, lacks any pre-defined format. Big data is characterized by high volume (measured in terabytes, petabytes, or exabytes), high velocity (flowing rapidly and continuously from its sources), and high variety (coming in different formats from different sources).
So how does AI work? AI works by combining large amounts of data with fast, iterative processing and intelligent algorithms, allowing the software to learn automatically from patterns or features in the data. The important thing to note here is that AI will only learn from the data that it has, so as you use algorithms to make decisions, make sure that the data is valid and that any biases are accounted for and corrected.
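The idea of learning iteratively from data can be illustrated with a minimal sketch. It assumes a toy task that is not from the video: fitting a straight line to a handful of made-up points by repeatedly adjusting two numbers, a technique known as gradient descent. The names `fit_line`, `learning_rate`, and `steps` are hypothetical and chosen only for this example.

```python
# Toy data (hypothetical): points that follow y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by repeatedly nudging w and b to reduce squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Take a small step in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Learning from clean data recovers the underlying pattern.
w, b = fit_line(xs, ys)

# Learning from skewed data (every label shifted by 3) reproduces the skew:
# the model has no way to know the labels were off.
biased_ys = [y + 3.0 for y in ys]
w_biased, b_biased = fit_line(xs, biased_ys)

print(f"clean data:  w={w:.3f}, b={b:.3f}")
print(f"biased data: w={w_biased:.3f}, b={b_biased:.3f}")
```

The second fit is the point about validity and bias in miniature: the algorithm is identical in both runs, and only the data differs.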
Today you can collect data in many formats. We can classify data into four major groups: structured, semi-structured, quasi-structured, and unstructured. Let's look at the characteristics of each type of data as well as some examples. Structured data is probably the data format that looks most familiar to you. This type of data is clearly labeled and organized in a neat table. [Video description begins] A table displays on-screen with four columns and five rows. It contains numerical data. [Video description ends]
A Microsoft Excel spreadsheet is an example of structured data that you may have seen and even used before. In terms of advantages, it's easy to manipulate and display. However, because it is so rigid, it's not suitable for the many data sources that can't be quickly categorized into rows and columns. One step beyond structured data is semi-structured data. This format is labeled and can be found in a nested style.
While it's organized, it's not in a table format, so it's a bit more versatile and can incorporate different data sources without needing to change the structure. [Video description begins] A part of code interface displays on-screen. It contains data and metadata about books. [Video description ends] It's important to note that this flexibility can become unwieldy, so you should be mindful of the number of attributes to include. Examples include email metadata and XML. Next on the list is quasi-structured data. This has some patterns in the way that it's presented, but it doesn't come with clear labels or structure.
It doesn't have metadata like semi-structured data, so it requires more work to format and sort through. Quasi-structured data includes clickstream data and Google search results. [Video description begins] A part of interface displays on-screen. It contains two dates and two commands. [Video description ends] Last but not least is unstructured data, which is considered to be the most abundant type of data that exists today. This is data that doesn't have any pre-defined format.
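The first three formats can be contrasted in a short sketch. All of the sample data below is made up for illustration, and the clickstream line layout (date, time, command, path) is an assumption rather than a standard. The example uses only Python's standard library.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Structured: rows and columns under a fixed header (hypothetical sample).
table_text = "title,year\nDune,1965\nEmma,1815\n"
rows = list(csv.DictReader(io.StringIO(table_text)))

# Semi-structured: labeled and nested, but records need not share the same
# attributes. The second book has no year (hypothetical sample).
xml_text = """
<library>
  <book year="1965"><title>Dune</title><author>Frank Herbert</author></book>
  <book><title>Emma</title></book>
</library>
"""
books = [
    {"title": book.findtext("title"), "year": book.get("year")}
    for book in ET.fromstring(xml_text.strip()).iter("book")
]

# Quasi-structured: a visible pattern but no labels, so the structure has to
# be imposed by the reader, here with a regular expression (assumed layout).
log_text = """\
2024-01-05 10:15:02 GET /home
2024-01-05 10:15:09 GET /products?id=42
"""
line_pattern = re.compile(r"^(\S+) (\S+) (\S+) (\S+)$")
clicks = [line_pattern.match(line).groups() for line in log_text.splitlines()]

print(rows)
print(books)
print(clicks)
```

The amount of work grows from top to bottom: the table is read directly by its header, the XML is read by its labels but has to tolerate a missing attribute, and the log lines carry no labels at all until a pattern is written for them.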
When we think about the wealth of information on the Internet today, such as videos, podcasts, and pictures, all of those formats are considered unstructured. While it allows us to look at more data, it does take a lot of time and effort to format the information for analysis. So what exactly is big data? It is commonly defined by three characteristics known as the three V's. The first is high volume.
Typically, the size of big data is described in terabytes, petabytes, and even exabytes, far more than would fit on a regular laptop. The second is high velocity: big data flows from its sources at a rapid and continuous pace. The third is high variety: big data comes in different formats from heterogeneous sources. So if you're wondering whether you're working with big data, check whether those criteria fit the information that you're working with.
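To put the volume units in perspective, here is a small arithmetic sketch. It uses the decimal (SI) definitions of the units and assumes, purely for illustration, a laptop with a 1 TB drive. The helper names `to_bytes` and `laptops_needed` are hypothetical.

```python
# Decimal (SI) storage units, in bytes.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18}

# Assumption for illustration: a regular laptop with a 1 TB drive.
LAPTOP_BYTES = 1 * UNITS["TB"]


def to_bytes(amount, unit):
    """Convert an amount in TB, PB, or EB to bytes."""
    return amount * UNITS[unit]


def laptops_needed(amount, unit):
    """Number of assumed 1 TB laptops needed to hold the data (rounded up)."""
    return -(-to_bytes(amount, unit) // LAPTOP_BYTES)  # ceiling division


for unit in ("TB", "PB", "EB"):
    print(f"1 {unit} fills {laptops_needed(1, unit):,} laptop(s)")
```

Under that assumption, one petabyte fills a thousand such laptops and one exabyte fills a million.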