How do I find elements with multiple classes using Beautiful Soup?

In Beautiful Soup, you can find elements with multiple classes by using the .find_all() or .select() methods. The .find_all() method filters on the class attribute through its class_ parameter, while the .select() method takes a CSS selector.

Here's how to use both methods:

Using .find_all() with class_

When using .find_all(), be aware that passing a list of strings to the class_ parameter matches elements that have any of the listed classes, not all of them. To require both class1 and class2, search on one class with class_ and keep only the results that also carry the other:

from bs4 import BeautifulSoup

# Sample HTML content
html_doc = """
<html>
<body>
<div class="class1 class2">Content A</div>
<div class="class1">Content B</div>
<div class="class2">Content C</div>
<div class="class1 class2 class3">Content D</div>
</body>
</html>
"""

# Parse the HTML content
soup = BeautifulSoup(html_doc, 'html.parser')

# Find all elements with both 'class1' and 'class2'.
# class_=["class1", "class2"] would match elements with EITHER class, so
# search on 'class1' and keep the results that also carry 'class2'.
elements_with_classes = [
    element
    for element in soup.find_all(class_="class1")
    if "class2" in element["class"]
]

for element in elements_with_classes:
    print(element)
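
Another way to require several classes with .find_all() is to pass it a filter function, which Beautiful Soup calls once per tag. The following is a minimal sketch, not the only way to do it; the helper name has_all_classes is hypothetical and not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

html_doc = """
<div class="class1 class2">Content A</div>
<div class="class1">Content B</div>
<div class="class2">Content C</div>
<div class="class1 class2 class3">Content D</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")


def has_all_classes(*required):
    """Build a tag filter that accepts tags carrying every class in `required`.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    required = set(required)

    def matcher(tag):
        # tag.get("class") is a list of class names, or None when the tag
        # has no class attribute at all
        return required.issubset(tag.get("class") or [])

    return matcher


matches = soup.find_all(has_all_classes("class1", "class2"))
texts = [tag.get_text() for tag in matches]
print(texts)  # ['Content A', 'Content D']
```

A function filter like this scales to any number of required classes without changing the search call.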

Using .select() with CSS selectors

The .select() method allows you to use CSS selectors to match elements in the document. To select elements with multiple classes, you can concatenate the class names with dots, just like you would in a CSS file.

from bs4 import BeautifulSoup

# Sample HTML content
html_doc = """
<html>
<body>
<div class="class1 class2">Content A</div>
<div class="class1">Content B</div>
<div class="class2">Content C</div>
<div class="class1 class2 class3">Content D</div>
</body>
</html>
"""

# Parse the HTML content
soup = BeautifulSoup(html_doc, 'html.parser')

# Use CSS selectors to find all elements with both 'class1' and 'class2'
elements_with_classes = soup.select('.class1.class2')

for element in elements_with_classes:
    print(element)

In both cases, the output will be:

<div class="class1 class2">Content A</div>
<div class="class1 class2 class3">Content D</div>

Note that when using the .select() method, there should be no space between class names, as a space would indicate a descendant combinator in CSS.
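
To make the difference concrete, here is a small sketch that runs both selectors against the same markup. The sample HTML is made up for illustration: one element carries both classes, and another nests a class2 element inside a class1 element.

```python
from bs4 import BeautifulSoup

html_doc = """
<div class="class1 class2">Both classes on one element</div>
<div class="class1">
  <span class="class2">Nested inside class1</span>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# No space: a single element must carry both classes
compound = soup.select(".class1.class2")

# With a space: a class2 element somewhere inside a class1 element
descendant = soup.select(".class1 .class2")

compound_texts = [el.get_text(strip=True) for el in compound]
descendant_texts = [el.get_text(strip=True) for el in descendant]

print(compound_texts)    # ['Both classes on one element']
print(descendant_texts)  # ['Nested inside class1']
```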

Keep in mind that the order of the classes does not matter when you're matching elements by class. An element with class="class1 class2" will match the same as an element with class="class2 class1".
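
As a quick check of this, the sketch below uses two made-up elements whose class attributes list the same classes in opposite order, and queries them with the selector written both ways.

```python
from bs4 import BeautifulSoup

html_doc = """
<div class="class1 class2">Content A</div>
<div class="class2 class1">Content E</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# The same two elements match regardless of the order in the selector
# or the order in the class attribute
forward = [el.get_text() for el in soup.select(".class1.class2")]
reverse = [el.get_text() for el in soup.select(".class2.class1")]

print(forward)  # ['Content A', 'Content E']
print(reverse)  # ['Content A', 'Content E']
```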
