The first step of any data analysis is importing datasets. We’ve loaded the code to do this in the first cell of the exercise notebook. Let’s break this code down line by line!
Tip: we’ll be using the term syntax as shorthand for the structure of computer code. When we say “this is the syntax to do xyz”, what we mean is “here are the correct commands in the correct order to do xyz using Python code.”
Importing pandas
Pandas is an extra set of tools in Python called a library. To use these extra tools, we have to tell Python that we’ll be using pandas. The syntax to use any Python library is
import library_name as alias
where
- library_name is the name of the library
- alias is a nickname or abbreviation we’ll use
Most data scientists use the abbreviation pd for pandas, so our code is
import pandas as pd
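The same import pattern works for any importable module, not only pandas. As a small sketch, the example below uses Python's built-in statistics module with the alias stats. Both the alias and the sample values are chosen here for illustration and are not part of the exercise notebook.

```python
# The alias `stats` is an illustrative choice, not part of the lesson.
import statistics as stats

# Hypothetical values for illustration only.
values = [2, 4, 6, 8]

# Call a function from the library through its alias.
average = stats.mean(values)
print(average)
```

Once an alias is defined, every tool in the library is reached through it, which is why the pandas functions later in this lesson start with pd.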
Importing the Dataset
Now that pandas is imported, we can import our dataset using the pandas .read_csv() function. The syntax to import a CSV dataset is
dataset_name = pd.read_csv('filename.csv')
where
- dataset_name is the name we want to call the dataset in our code
- filename.csv is the CSV file containing the data
In our case, we’ll call the dataset repair and the file is named repair.csv. So our code is
repair = pd.read_csv('repair.csv')
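To try .read_csv() without a file on disk, here is a minimal sketch that reads CSV text from memory instead. The column names and values are hypothetical, since the contents of repair.csv are not shown in this lesson, and the variable name sample is only a placeholder.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of a CSV file; the real
# repair.csv is not shown in this lesson, so these columns are made up.
csv_text = "device,cost\nlaptop,120\nphone,80\ntablet,95\n"

# read_csv accepts a file path or a file-like object such as StringIO.
sample = pd.read_csv(io.StringIO(csv_text))

print(type(sample))   # the result is a pandas DataFrame
print(sample.shape)   # (number of rows, number of columns)
```

The first line of the CSV text becomes the column names, and each following line becomes one row of the dataset.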
Previewing the Dataset
While not strictly part of the import process, it is always a good idea to preview a dataset immediately after importing it. Pandas provides a method for this called .head(), which displays the first five rows of the dataset. The syntax for .head() is
dataset_name.head()
where
- dataset_name is whatever we named the dataset when we imported it
Our dataset is named repair, so we’ll use the code
repair.head()
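As a rough illustration of what .head() returns, the sketch below builds a small dataset with eight rows and previews it. The data and the names sample, first_five, and first_three are hypothetical. The example also shows that .head() accepts an optional number of rows, which this lesson does not otherwise use.

```python
import pandas as pd

# Hypothetical data: eight rows, so the preview is shorter than the dataset.
sample = pd.DataFrame({
    "ticket": list(range(1, 9)),
    "cost": [10 * n for n in range(1, 9)],
})

first_five = sample.head()     # default: the first five rows
first_three = sample.head(3)   # optional argument: how many rows to return

print(first_five)
print(len(first_five), len(first_three))
```

In a notebook, leaving sample.head() as the last line of a cell displays the preview as a formatted table without needing print().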
How to Use Your Jupyter Notebook:
- You can run a cell in the Notebook to the right by placing your cursor in the cell and clicking the Run button or pressing the Shift + Enter/Return keys.
- When you are ready to evaluate the code in your Notebook, press the Save button at the top of the Notebook or use the control/command + s keys before clicking the Test Work button at the bottom. Be sure to save your solution code in the cell marked ## YOUR SOLUTION HERE ## or it will not be evaluated.
- When you are ready to move on, click Next.
Instructions
We’ve already imported and displayed the repair dataset for you. Below that cell are three cells corresponding to these three checkpoints.
In the first of those cells, import the pandas library using the alias pd (note that there won’t be any output displayed).
Make sure to run and then save the notebook before selecting Test Work!
We’ve created a laptop-specific version of the repair dataset stored in a file named laptops.csv. Complete the code to import the dataset in laptops.csv with the name laptops.
Make sure to run and then save the notebook before selecting Test Work!
Preview the first five rows of the imported laptops dataset.
Make sure to run and then save the notebook before selecting Test Work!