StreetEasy is New York City’s leading real estate marketplace — from studios to high-rises, Brooklyn Heights to Harlem.
In this lesson, you will be working with a dataset that contains a sample of 5,000 rentals listings in
Queens, active on StreetEasy in June 2016.
It has the following columns:
rental_id: rental ID
rent: price of rent in dollars
bedrooms: number of bedrooms
bathrooms: number of bathrooms
size_sqft: size in square feet
min_to_subway: distance from subway station in minutes
floor: floor number
building_age_yrs: building’s age in years
no_fee: does it have a broker fee? (0 for fee, 1 for no fee)
has_roofdeck: does it have a roof deck? (0 for no, 1 for yes)
has_washer_dryer: does it have washer/dryer in unit? (0/1)
has_doorman: does it have a doorman? (0/1)
has_elevator: does it have an elevator? (0/1)
has_dishwasher: does it have a dishwasher (0/1)
has_patio: does it have a patio? (0/1)
has_gym: does the building have a gym? (0/1)
neighborhood: (ex: Greenpoint)
borough: (ex: Brooklyn)
More information about this dataset can be found in the StreetEasy Dataset article.
Let’s start by doing exploratory data analysis to understand the dataset better. We have broken the dataset for you into:
First, pick a borough out of the three (
Queens) that you are most interested in!
We are going to import the dataset and store it in a variable called
To import, we will need to run this snippet:
path with one of the three URL’s above.
Let’s take a look at the first few rows using