Beautiful Soup
Learn how to take data that's displayed on websites and put it into Python using the Beautiful Soup library!
Key Concepts
Review core concepts you need to learn to master this subject
Beautiful Soup Object and Methods
from bs4 import BeautifulSoup

# `webpage` is assumed to be an HTTP response, e.g. the result of requests.get(url).
# This line of code creates a BeautifulSoup object from the webpage:
soup = BeautifulSoup(webpage.content, "html.parser")
# Within the `soup` object, the first tag of a given name can be accessed by name:
first_div = soup.div
# or elements can be selected by CSS selector:
all_elements_of_header_class = soup.select(".header")
# or by a call to `.find_all`:
all_p_elements = soup.find_all("p")
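BeautifulSoup uses a parser, such as Python's built-in "html.parser", to read the content of a webpage and build an object from it. That object represents the page as a tree of tags and provides methods for traversing the tree and for searching it by tag name or CSS selector.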
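As a minimal sketch of the three access styles above, the snippet below parses a small inline HTML string instead of a fetched page. The HTML and the name `html` are hypothetical stand-ins for `webpage.content`; only the BeautifulSoup calls come from the snippet above.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for `webpage.content` (an assumption for illustration).
html = """
<html><body>
  <div class="header"><h1>Cookie Recipes</h1></div>
  <div><p>First paragraph.</p><p>Second paragraph.</p></div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Access by tag name returns the first matching tag in the document.
first_div = soup.div

# .select() takes a CSS selector and returns a list of matching tags.
header_elements = soup.select(".header")

# .find_all() returns a list of every tag with the given name.
all_p_elements = soup.find_all("p")

print(first_div.h1.string)
print(len(header_elements), len(all_p_elements))
```

Note that `soup.div` yields a single tag, while `.select()` and `.find_all()` both return lists, so their results usually need indexing or a loop.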
Web Scraping with Beautiful Soup
Lesson 1 of 1
- 1. Before we get started, a quick note on prerequisites: This course requires knowledge of Python. Some understanding of the Python library Pandas will also be helpful later on in the lesson, …
- 2. When we scrape websites, we have to make sure we are following some guidelines so that we are treating the websites and their owners with respect. Always check a website’s Terms and Conditions bef…
- 4. When we printed out all of that HTML from our request, it seemed pretty long and messy. How could we pull out the relevant information from that long string? BeautifulSoup is a Python library that…
- 5. BeautifulSoup breaks the HTML page into several types of objects. Tags: A Tag corresponds to an HTML tag in the original document. These lines of code: soup = BeautifulSoup(‘ An example di…
- 6. To navigate through a tree, we can call the tag names themselves. Imagine we have an HTML page that looks like this: World’s Best Chocolate Chip Cookies Ingredients 1 cup flour …
- 7. When we’re telling our Python script what HTML tags to grab, we need to know the structure of the website and what we’re looking for. Many browsers, including Chrome, Firefox, and Safari, have Dev…
- 9. Another way to capture your desired elements with the soup object is to use CSS selectors. The .select() method will take in all of the CSS selectors you normally use in a .css file! Search Resul…
- 10. When we use BeautifulSoup to select HTML elements, we often want to grab the text inside of the element, so that we can analyze it. We can use .get_text() to retrieve the text inside of whatever ta…
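The navigation, CSS selector, and text extraction steps in the outline can be sketched together as follows. The markup is a hypothetical reconstruction loosely based on the cookie recipe mentioned above; the tag layout, the `banner` class, and the second ingredient are assumptions for illustration, not the lesson's actual page.

```python
from bs4 import BeautifulSoup

# Hypothetical page; the structure and the "banner" class are assumed for this example.
html = """
<h1>World's Best Chocolate Chip Cookies</h1>
<div class="banner"><h2>Ingredients</h2></div>
<ul>
  <li>1 cup flour</li>
  <li>1/2 cup sugar</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Navigate the tree by tag name, then pull out the text inside the tag.
title = soup.h1.get_text()

# Use a CSS descendant selector to collect every <li> inside a <ul>.
ingredients = [li.get_text() for li in soup.select("ul li")]

# Combine a class selector with a tag selector.
banner_heading = soup.select(".banner h2")[0].get_text()

print(title)
print(ingredients)
print(banner_heading)
```

Because `.select()` returns a list, the single banner heading is taken with `[0]`; on a real page it is safer to check that the list is non-empty first.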