Functions in R

A function is a named, reusable block of code. Calling the function by name executes that code. We can call the function with different input arguments, or parameters, to get different results. A return() statement specifies the value that is given back when the function is complete.

# Define a function
standardize <- function(data_vector) {
  # code to run
  standardized <- (data_vector - mean(data_vector)) / sd(data_vector)
  # dictate what to return
  return(standardized)
}
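
As a minimal sketch of calling such a function, the example below defines standardize() in full and applies it to a small vector. The scores values are hypothetical and chosen only for illustration.

```r
# Define the function: subtract the mean, then divide by the standard deviation
standardize <- function(data_vector) {
  standardized <- (data_vector - mean(data_vector)) / sd(data_vector)
  return(standardized)
}

# Hypothetical input values, used only for illustration
scores <- c(2, 4, 6, 8, 10)

# Call the function by name, passing the vector as its argument
z_scores <- standardize(scores)

# After standardizing, the mean is approximately 0
# and the standard deviation is approximately 1
mean(z_scores)
sd(z_scores)
```

Calling standardize() again with a different vector runs the same code on the new input, which is the main reason to wrap the calculation in a function.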

apply() Functions in R

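We can use the apply(X, MARGIN, FUN) function to apply a function to every row or every column of a data structure. X is the dataframe or matrix we want to apply the function to, MARGIN is where we specify whether to apply to rows (1) or columns (2), and FUN is the function to apply. A dataframe passed as X is converted to a matrix first, so apply() works best when all of its columns have the same type.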

## Get the mean of each column in dat
apply(X = dat, MARGIN = 2, FUN = mean)
## Get the max of each row in dat
apply(X = dat, MARGIN = 1, FUN = max)
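
To make the two calls above concrete, here is a small sketch that builds a dataframe and runs them. The dat values and the column names test1 and test2 are assumptions made up for this example.

```r
# Hypothetical data: three rows and two numeric columns
dat <- data.frame(
  test1 = c(80, 90, 70),
  test2 = c(85, 95, 75)
)

# MARGIN = 2: one result per column (80 for test1, 85 for test2)
col_means <- apply(X = dat, MARGIN = 2, FUN = mean)

# MARGIN = 1: one result per row (85, 95, 75)
row_maxes <- apply(X = dat, MARGIN = 1, FUN = max)

# FUN can also be a function we write ourselves,
# here the spread between the largest and smallest value in each column
col_spread <- apply(X = dat, MARGIN = 2, FUN = function(column) max(column) - min(column))
```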

Tibbles and dplyr

Tibbles are a type of dataframe unique to the tidyverse. Tibbles can be manipulated using the 5 basic functions of the dplyr package:

  • mutate(): add or adjust whole columns
  • select(): specify which columns to keep or remove from a tibble
  • filter(): specify which rows to keep or remove from a tibble
  • summarize(): reduces one or more variables to a summary value
  • arrange(): select the order of rows in a tibble by the values in a column

Multiple steps and functions can be connected together into a single step using the pipe %>% operator.

tib %>%
  # make a column for the average of test1
  summarize(avg_test1 = mean(test1)) %>%
  # add a new column test1_letter,
  # assigned based on the test1 average score
  mutate(test1_letter = case_when(
    avg_test1 < 80 ~ "C",
    avg_test1 >= 80 & avg_test1 < 90 ~ "B",
    avg_test1 >= 90 ~ "A")) %>%
  # reorder the columns to be the average score and then letter
  select(avg_test1, test1_letter)
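
The sketch below chains the five dplyr functions listed above on a small tibble. It assumes the dplyr package is installed. The tibble contents and the column names student, test1, and test2 are hypothetical.

```r
library(dplyr)

# Hypothetical tibble of test scores
tib <- tibble::tibble(
  student = c("A", "B", "C", "D"),
  test1 = c(95, 72, 88, 64),
  test2 = c(90, 80, 85, 70)
)

result <- tib %>%
  # keep only rows where test1 is at least 70 (drops student D)
  filter(test1 >= 70) %>%
  # add a column holding each student's average of the two tests
  mutate(avg = (test1 + test2) / 2) %>%
  # order rows from the highest average to the lowest
  arrange(desc(avg)) %>%
  # keep only the student and avg columns
  select(student, avg)

# summarize() collapses all rows into a single summary row
overall <- tib %>%
  summarize(avg_test1 = mean(test1))
```

Each step receives the tibble produced by the step before it, so the pipeline reads top to bottom in the order the operations happen.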

R Control Flow

Control flow involves the program deciding which code to execute. The decision-making is established through conditional statements, i.e. if, else if, and else. Each condition should evaluate to a single logical TRUE or FALSE. You can use comparison operators like <, >= and == to produce logical values, and logical operators like !, & and | to negate or combine them.

if (condition_to_check) {
  # execute code and don't check any more conditions
} else if (other_condition_to_check & and_this_condition_to_check) {
  # execute code only if both are true and don't check any more conditions
} else if (either_this_condition | or_this_condition) {
  # execute code if either condition is true and don't go to else
} else {
  # the default code if none of the conditions above are true
}
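
As an illustrative sketch of the template above, the function below picks a letter based on a score. The function name letter_for(), its arguments, and the grade cutoffs are all hypothetical.

```r
# Hypothetical helper: choose a letter from a score and an extra-credit flag
letter_for <- function(score, has_extra_credit) {
  if (score >= 90) {
    # first condition is true, so no other conditions are checked
    letter <- "A"
  } else if (score >= 80 & score < 90) {
    # both comparisons must be TRUE
    letter <- "B"
  } else if (score >= 70 | has_extra_credit) {
    # either comparison being TRUE is enough
    letter <- "C"
  } else {
    # default when none of the conditions above are true
    letter <- "F"
  }
  return(letter)
}

letter_for(95, FALSE)  # "A"
letter_for(85, FALSE)  # "B"
letter_for(60, TRUE)   # "C"
letter_for(60, FALSE)  # "F"
```

The conditions are checked from top to bottom and only the first matching branch runs, so their order matters.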

Loops in R

A loop allows you to execute the same piece of code multiple times. Each execution is called an iteration. A for loop allows you to specify the number of iterations or go through a data structure’s length.

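The loop_variable takes each value of the sequence in turn, and you can use it inside the loop body. In R, the variable is not removed when the loop finishes; it keeps the last value it was assigned.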

A while loop, on the other hand, repeats code as long as a condition is true. The condition should start as true, and some value should be altered in the loop so that the condition becomes false at some point; otherwise the loop never ends.

# how to define a for loop
for (loop_variable in sequence) {
  # code to repeat
}

# how to define a while loop
while (condition_to_check_every_iteration) {
  # code to repeat
}
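
The sketch below fills in both templates with concrete code. The variable names and starting values are hypothetical and chosen only for illustration.

```r
# for loop: add up the values in a vector, one element per iteration
values <- c(3, 5, 7)
total <- 0
for (value in values) {
  total <- total + value
}
total  # 15

# while loop: count how many times 100 can be halved before it drops below 1
x <- 100
halvings <- 0
while (x >= 1) {
  # x changes on every iteration, so the condition eventually becomes FALSE
  x <- x / 2
  halvings <- halvings + 1
}
halvings  # 7
```

The for loop runs a fixed number of times (once per element of values), while the number of while iterations depends on when the condition stops being true.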
