We now know how to create our own DataFrame. However, most of the time, we’ll be working with datasets that already exist. One of the most common formats for big datasets is the CSV.
CSV (comma-separated values) is a plain-text spreadsheet format. You can find CSVs in lots of places:
- Online datasets (here’s an example from data.gov)
- Export from Excel or Google Sheets
- Export from SQL
The first row of a CSV contains column headings. All subsequent rows contain values. Each column heading and each value is separated by a comma:
That example CSV represents the following table:
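To make the layout concrete, here is a minimal sketch in Python. The column names and rows are made up for illustration and do not come from a real dataset. It parses a small CSV string with the standard-library `csv` module and separates the heading row from the value rows.

```python
import csv
import io

# Made-up CSV text: the first line holds column headings, the rest hold values.
csv_text = """name,city,age
Ana,Lisbon,34
Ben,Toronto,28
"""

# csv.reader splits each line on commas and yields one list of strings per row.
rows = list(csv.reader(io.StringIO(csv_text)))
header, records = rows[0], rows[1:]

print(header)
for record in records:
    print(record)
```

Note that every value comes back as a string, including the ages. A plain CSV carries no type information, so converting values to numbers is left to whatever reads the file.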
You run a cupcake store and want to create a record of all of the cupcakes that you offer.
Write the following data as a CSV in
| Chocolate Cake | chocolate | chocolate | chocolate shavings |
| Birthday Cake | vanilla | vanilla | rainbow sprinkles |
| Carrot Cake | carrot | cream cheese | almonds |
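One possible way to produce this CSV programmatically is sketched below, using Python's standard-library `csv` module. The column headings (`name`, `cake_flavor`, `frosting_flavor`, `topping`) are assumptions chosen for illustration, since the table above does not name its columns. The sketch writes to an in-memory buffer rather than to a specific file.

```python
import csv
import io

# Column headings are assumed for illustration; the table does not name them.
header = ["name", "cake_flavor", "frosting_flavor", "topping"]
cupcakes = [
    ["Chocolate Cake", "chocolate", "chocolate", "chocolate shavings"],
    ["Birthday Cake", "vanilla", "vanilla", "rainbow sprinkles"],
    ["Carrot Cake", "carrot", "cream cheese", "almonds"],
]

# Write into an in-memory text buffer, ending each row with a newline.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)      # first row: column headings
writer.writerows(cupcakes)   # remaining rows: one cupcake per line

csv_text = buffer.getvalue()
print(csv_text)
```

Values that contain spaces, such as `cream cheese`, need no special handling. The writer only adds quotes when a value contains a comma, a quote character, or a line break.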