Each column of a data frame holds items of a single data type. The data types that R uses are: character, numeric (real or decimal), integer, logical, and complex. Often, we want to convert between types so that we can do better analysis. For example, if a numerical column like "num_users" is stored as a vector of characters instead of numerics, it becomes harder to do something like make a line graph of users over time.
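As a quick illustration of these types, the sketch below uses base R's class() on a few made-up values. The values and the num_users vector are hypothetical and only stand in for real data.

```r
# Hypothetical values, one for each basic type named above
class("apple")   # "character"
class(3.5)       # "numeric"
class(3L)        # "integer"
class(TRUE)      # "logical"
class(2 + 1i)    # "complex"

# A hypothetical num_users vector that was read in as text
num_users <- c("120", "135", "150")
users_class <- class(num_users)       # "character"
users_numeric <- is.numeric(num_users) # FALSE, so it cannot be plotted as numbers yet
print(users_class)
print(users_numeric)
```

Even though every element of num_users looks like a number, R treats the vector as text until it is converted.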
To see the type of each column of a data frame, we can use str(). str() displays the internal structure of an R object. Calling str() with a data frame as an argument prints a variety of information, including the data types. For a data frame with item, price, and calories columns, the data types could be reported as:
#> $ item: chr
#> $ price: chr
#> $ calories: num
We can see that the price column is made up of characters, which will probably make our analysis of price more difficult. We’ll look at how to convert columns into numeric values in the next few exercises.
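The output above can be reproduced with a small sketch. The data frame name menu and all of its values are hypothetical. Only the column names and their types follow the output shown above.

```r
# Hypothetical data: price is deliberately stored as text
menu <- data.frame(
  item = c("coffee", "bagel", "muffin"),
  price = c("$2.50", "$1.75", "$3.00"),
  calories = c(5, 290, 420),
  stringsAsFactors = FALSE  # keep text columns as character on older R versions
)

# Print the structure, including each column's type
str(menu)

# A compact per-column summary of the types
col_types <- sapply(menu, class)
print(col_types)
```

Here sapply(menu, class) is just one convenient way to list the types on a single line. str() remains the more detailed view.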
Let’s inspect the data types in the
Print out the structure of
If we wanted to make a scatterplot of age vs. average exam score, would we be able to do it with this type of data?
Paste the following code into the last code block to try to print out the mean of the score column:
students %>% summarise(mean_score = mean(score))
What warning do you see?
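To try the same call outside the lesson environment, here is a minimal sketch. It assumes the dplyr package is installed. The students data frame below is hypothetical: its rows, and the choice to store score as text, are assumptions made for illustration and are not the lesson's actual data.

```r
library(dplyr)

# Hypothetical stand-in for the lesson's data frame
students <- data.frame(
  age = c(17, 18, 17),
  score = c("88%", "92%", "79%"),  # assumption: scores stored as text
  stringsAsFactors = FALSE
)

# Check the column types first
str(students)

# Same call as in the exercise; watch the console for a warning
result <- students %>% summarise(mean_score = mean(score))
print(result)
```

Comparing the str() output with the result of summarise() should make it clear why the type of score matters here.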