Before we go any further, let’s stop to understand when the data gets bound to the visualization:
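Data is bound to a ggplot2 visualization by passing a data frame as the first argument of the ggplot() function call. You can include the named argument, like ggplot(data=df_variable), or simply pass in the data frame positionally, like ggplot(df_variable).
Because the data is bound at this step, the layers that follow, which are function calls added with a + plus sign, all have access to the data frame and can use the column names as variables.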
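As a minimal sketch of the two call styles, assuming ggplot2 is installed: the contents of df_variable below are made up for illustration, and the check reads the bound data back through the plot object's data field.

```r
library(ggplot2)

# Hypothetical data frame, invented for this illustration
df_variable <- data.frame(x = c(1, 2, 3), y = c(2, 4, 6))

# Named argument
named <- ggplot(data = df_variable)

# Positional argument: data is the first parameter of ggplot()
positional <- ggplot(df_variable)

# Compare the data frames carried by the two plot objects
same_data <- identical(named$data, positional$data)
print(same_data)
```

Either style should leave the same data frame attached to the plot, so the choice is mostly about readability.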
For example, assume we have a data frame sales with the columns cost and profit. In this example, we assign the data frame sales to the ggplot() object that is initialized:
viz <- ggplot(data=sales) + geom_point(aes(x=cost, y=profit))
viz # renders plot
In the example above:
The ggplot() object was initialized with the data frame sales assigned to it.
The subsequent geom_point layer used the cost and profit columns to define the scales of the axes for that particular geom. Notice that it simply referred to those columns by their column names.
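To try the example end to end, here is a small sketch with made-up values standing in for sales (the numbers are assumptions, not real data). It uses layer_data() from ggplot2 to inspect what the geom_point layer picked up from the bound data frame.

```r
library(ggplot2)

# Made-up values standing in for the sales data frame
sales <- data.frame(
  cost   = c(10, 20, 30, 40),
  profit = c(3, 8, 12, 20)
)

# Bind the data once, then refer to columns by name inside aes()
viz <- ggplot(data = sales) +
  geom_point(aes(x = cost, y = profit))

# layer_data() returns the data computed for a layer (here, layer 1)
point_data <- layer_data(viz, 1)
print(point_data[, c("x", "y")])
```

The x and y columns of the layer data should mirror cost and profit, which shows that the layer read its values from the data frame bound in ggplot().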
Note: There are other ways to bind data to layers if you want each layer to use a different dataset, but the most readable and common approach is to bind the data frame at the ggplot() step and have your layers use data from that data frame.
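One such alternative is passing a data argument to an individual layer. The sketch below assumes two made-up data frames, sales and targets (targets is a hypothetical name introduced only for this illustration), and shows a second layer overriding the data bound at the ggplot() step.

```r
library(ggplot2)

# Two made-up data frames for illustration
sales   <- data.frame(cost = c(10, 20, 30), profit = c(3, 8, 12))
targets <- data.frame(cost = c(15, 25), profit = c(10, 15))

viz <- ggplot(data = sales) +
  # No data argument: this layer uses sales from ggplot()
  geom_point(aes(x = cost, y = profit)) +
  # data argument: this layer uses targets instead
  geom_point(data = targets, aes(x = cost, y = profit), color = "red")

# Row counts per layer reflect which data frame each layer used
n_default  <- nrow(layer_data(viz, 1))
n_override <- nrow(layer_data(viz, 2))
print(c(n_default, n_override))
```

Binding once in ggplot() keeps the common case short, while the per-layer data argument is available when a layer genuinely needs different data.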
Create a new variable named viz and assign it a new ggplot object, created by calling ggplot() with the data frame movies as the data argument. After you’ve defined viz, state the variable name on a new line in order to see it.
Click run and watch your code render an empty canvas. Even though no data is displayed, the data is bound to the viz ggplot object!