Learn

The first step of any data analysis is importing datasets. We’ve loaded the code to do this in the first cell of the exercise notebook. Let’s break this code down line by line!

Tip: we’ll be using the term syntax as shorthand for the structure of computer code. When we say “this is the syntax to do xyz”, what we mean is “here are the correct commands in the correct order to do xyz using Python code.”

Importing pandas

Pandas is a Python library: an extra set of tools that is not built into Python itself. To use these tools, we have to tell Python that we’ll be using pandas. The syntax to import a library under an alias is

import library_name as alias

where

  • library_name is the name of the library
  • alias is a nickname or abbreviation we’ll use

Most data scientists use the abbreviation pd for pandas, so our code is

import pandas as pd
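
To see what the alias does, here is a minimal sketch. It is not part of the exercise notebook; it only checks that pd is another name for the pandas library.

```python
import pandas as pd  # pd is now a nickname for the pandas library

# Ask the library for its real name through the alias
library_name = pd.__name__
print(library_name)  # prints: pandas
```

After this line runs, every pandas tool is reached by typing pd. followed by the tool's name.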

Importing the Dataset

Now that pandas is imported, we can import our dataset using the pandas .read_csv() function. The syntax to import a CSV dataset is

dataset_name = pd.read_csv('filename.csv')

where

  • dataset_name is the name we want to call the dataset in our code
  • filename.csv is the CSV file containing the data

In our case, we’ll call the dataset repair and the file is named repair.csv. So our code is

repair = pd.read_csv('repair.csv')
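
The repair.csv file only exists inside the exercise notebook, so the sketch below first writes a small made-up CSV file and then imports it the same way. The file name example_repair.csv and the columns device and repair_cost are assumptions for illustration, not the real contents of repair.csv.

```python
import os
import tempfile

import pandas as pd

# Made-up CSV contents; the real repair.csv columns are not shown in this lesson
csv_text = "device,repair_cost\nphone,80\nlaptop,150\ntablet,95\n"

# Write the text to a temporary file so pd.read_csv() has a file to open
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "example_repair.csv")
with open(csv_path, "w") as f:
    f.write(csv_text)

# Same pattern as in the lesson: dataset_name = pd.read_csv('filename.csv')
example_repair = pd.read_csv(csv_path)
print(example_repair.shape)  # prints: (3, 2), meaning 3 rows and 2 columns
```

The first line of the CSV file becomes the column names, and each following line becomes one row of the dataset.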

Previewing the Dataset

While not strictly part of the import process, it is always a good idea to preview a dataset immediately after importing it. Pandas provides a method for this called .head(), which by default displays the first five rows of the dataset. The syntax for .head() is

dataset_name.head()

where

  • dataset_name is whatever we named the dataset when we imported it

Our dataset is named repair, so we’ll use the code

repair.head()
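
As a rough illustration of what .head() returns, the sketch below builds a made-up dataset with eight rows and previews it. The column name row_id is hypothetical and only there so we can count rows.

```python
from io import StringIO

import pandas as pd

# Made-up dataset: one column named row_id with the values 1 through 8
csv_text = "row_id\n" + "\n".join(str(i) for i in range(1, 9))
example = pd.read_csv(StringIO(csv_text))  # read_csv also accepts file-like objects

preview = example.head()        # no argument: first five rows
short_preview = example.head(3)  # optional argument: first three rows

print(len(example))        # prints: 8
print(len(preview))        # prints: 5
print(len(short_preview))  # prints: 3
```

Passing a number to .head() is optional. Leaving the parentheses empty, as in this lesson, gives five rows.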

How to Use Your Jupyter Notebook:

  • You can run a cell in the Notebook to the right by placing your cursor in the cell and clicking the Run button or pressing Shift+Enter/Return.
  • When you are ready to evaluate the code in your Notebook, press the Save button at the top of the Notebook or press Control/Command+S before clicking the Test Work button at the bottom. Be sure to write your solution code in the cell marked ## YOUR SOLUTION HERE ## or it will not be evaluated.
  • When you are ready to move on, click Next.

Screenshot of the buttons at the top of a Jupyter Notebook. The Run and Save buttons are highlighted

Instructions

1.

We’ve already imported and displayed the repair dataset for you. Below that cell are three cells corresponding to these three checkpoints.

In the first of those cells, import the pandas library using the alias pd (note that there won’t be any output displayed).

Make sure to run and then save the notebook before selecting Test Work!

2.

We’ve created a laptop-specific version of the repair dataset stored in a file named laptops.csv. Complete the code to import the dataset in laptops.csv with the name laptops.

Make sure to run and then save the notebook before selecting Test Work!

3.

Preview the first five rows of the imported laptops dataset.

Make sure to run and then save the notebook before selecting Test Work!
