How do I find elements with multiple classes using Beautiful Soup?

Beautiful Soup provides several methods to find elements with multiple classes. The key distinction is between finding elements that have all specified classes versus elements that have any of the specified classes.

Method 1: Using .find_all() for Elements with ANY Classes

Passing a list of class names to .find_all() matches elements that have at least one of the listed classes:

from bs4 import BeautifulSoup

html_doc = """
<html>
<body>
<div class="btn primary">Button 1</div>
<div class="btn secondary">Button 2</div>
<div class="btn primary large">Button 3</div>
<div class="primary">Not a button</div>
<div class="text-content">Some text</div>
</body>
</html>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

# Finds elements whose class list contains 'btn' OR 'primary' (or both)
elements = soup.find_all(class_=["btn", "primary"])
print(f"Found {len(elements)} elements with any of the classes")  # Found 4 elements with any of the classes
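To make the ANY behaviour concrete, the following minimal sketch lists which elements come back. It reuses the sample markup from above, trimmed to the div elements; the variable names any_match and any_texts are only illustrative.

```python
from bs4 import BeautifulSoup

# Same sample markup as above, trimmed to the <div> elements
html_doc = """
<div class="btn primary">Button 1</div>
<div class="btn secondary">Button 2</div>
<div class="btn primary large">Button 3</div>
<div class="primary">Not a button</div>
<div class="text-content">Some text</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# A list passed to class_ matches a tag if ANY of its classes is in the list
any_match = soup.find_all(class_=["btn", "primary"])
any_texts = [tag.get_text() for tag in any_match]

print(any_texts)
# ['Button 1', 'Button 2', 'Button 3', 'Not a button']
```

Each element appears once in the result, even when it carries both listed classes.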

Method 2: Using .select() for Elements with ALL Classes

CSS selectors are the most direct way to find elements that have all of the specified classes. Chain the class names with no space between them:

# Find elements with BOTH 'btn' AND 'primary' classes
elements = soup.select('.btn.primary')

for element in elements:
    print(element.get_text())
    print(f"Classes: {element.get('class')}")

Output:

Button 1
Classes: ['btn', 'primary']
Button 3
Classes: ['btn', 'primary', 'large']
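A common pitfall is to pass both class names to .find_all() as one space-separated string and expect AND semantics. As a hedged illustration (the "Reordered" element is added here only for the demonstration), such a string is compared against the whole class attribute value, so order and extra classes both matter:

```python
from bs4 import BeautifulSoup

html_doc = """
<div class="btn primary">Button 1</div>
<div class="btn primary large">Button 3</div>
<div class="primary btn">Reordered</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")

# CSS selector: both classes required; order and extra classes are ignored
css_texts = [t.get_text() for t in soup.select(".btn.primary")]

# String containing a space: compared against the full class attribute value
exact_texts = [t.get_text() for t in soup.find_all(class_="btn primary")]

print(css_texts)    # ['Button 1', 'Button 3', 'Reordered']
print(exact_texts)  # ['Button 1']
```

For "has both classes" queries, the CSS selector form is therefore usually the safer choice.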

Method 3: Advanced CSS Selectors

You can use more complex CSS selectors for sophisticated matching:

# Multiple class combinations
soup.select('.btn.primary, .btn.secondary')  # Elements with (btn AND primary) OR (btn AND secondary)

# Class with specific tag
soup.select('div.btn.primary')  # Only div elements with both classes

# Descendant selectors
soup.select('.container .btn.primary')  # .btn.primary inside .container (returns [] for the sample above, which has no .container element)
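The three selectors above can be tried on a small document. This is only a sketch: the markup, including the .container wrapper, and the texts() helper are assumptions made up for the illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the .container wrapper is an assumption for this sketch
html_doc = """
<div class="container">
  <div class="btn primary">Inside primary</div>
  <span class="btn primary">Inside span</span>
  <div class="btn secondary">Inside secondary</div>
</div>
<div class="btn primary">Outside primary</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")


def texts(selector):
    """Return the text of every element matched by a CSS selector."""
    return [tag.get_text() for tag in soup.select(selector)]


grouped = texts(".btn.primary, .btn.secondary")  # either combination
divs_only = texts("div.btn.primary")             # restricted to <div> tags
in_container = texts(".container .btn.primary")  # descendants of .container

print(grouped)       # ['Inside primary', 'Inside span', 'Inside secondary', 'Outside primary']
print(divs_only)     # ['Inside primary', 'Outside primary']
print(in_container)  # ['Inside primary', 'Inside span']
```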

Method 4: Custom Function for Exact Class Matching

When you need to distinguish "has all of these classes" from "has exactly these classes and no others", pass a custom function to .find_all():

def has_all_classes(tag, classes):
    """Check if tag has all specified classes"""
    if not tag.get('class'):
        return False
    tag_classes = set(tag.get('class'))
    return set(classes).issubset(tag_classes)

def has_exact_classes(tag, classes):
    """Check if tag has exactly these classes (no more, no less)"""
    if not tag.get('class'):
        return False
    return set(tag.get('class')) == set(classes)

# Find elements with both 'btn' and 'primary' classes
elements_with_both = soup.find_all(lambda tag: has_all_classes(tag, ['btn', 'primary']))

# Find elements with exactly 'btn' and 'primary' classes (no additional classes)
elements_exact = soup.find_all(lambda tag: has_exact_classes(tag, ['btn', 'primary']))
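If the lambda wrappers feel repetitive, one possible variation is a small factory that builds the predicate for you. The with_classes() helper below is a hypothetical name introduced for this sketch, not part of Beautiful Soup:

```python
from bs4 import BeautifulSoup


def with_classes(*classes, exact=False):
    """Build a find_all() predicate for a set of class names (hypothetical helper)."""
    wanted = set(classes)

    def predicate(tag):
        # Tags without a class attribute are treated as having no classes
        tag_classes = set(tag.get("class") or [])
        if exact:
            return tag_classes == wanted
        return wanted <= tag_classes  # subset test: all wanted classes present

    return predicate


html_doc = """
<div class="btn primary">Button 1</div>
<div class="btn secondary">Button 2</div>
<div class="btn primary large">Button 3</div>
<div class="primary">Not a button</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")

both = [t.get_text() for t in soup.find_all(with_classes("btn", "primary"))]
exact = [t.get_text() for t in soup.find_all(with_classes("btn", "primary", exact=True))]

print(both)   # ['Button 1', 'Button 3']
print(exact)  # ['Button 1']
```

Comparing sets rather than lists keeps the check independent of the order in which the classes appear in the markup.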

Real-World Example: Bootstrap Components

html = """
<div class="card border-primary">
<div class="card-header bg-primary text-white">Header</div>
<div class="card-body">
    <button class="btn btn-primary btn-lg">Large Primary Button</button>
    <button class="btn btn-secondary">Secondary Button</button>
    <button class="btn btn-primary disabled">Disabled Primary</button>
</div>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# Find all primary buttons (both classes must be present)
primary_buttons = soup.select('button.btn.btn-primary')
print(f"Found {len(primary_buttons)} primary buttons")

# Find buttons with specific size and style
large_primary = soup.select('.btn.btn-primary.btn-lg')
disabled_primary = soup.select('.btn.btn-primary.disabled')
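A related need is excluding a class, for example primary buttons that are not disabled. The sketch below assumes a Beautiful Soup version whose .select() supports the :not() pseudo-class (it relies on the Soup Sieve package that recent releases depend on), and uses a trimmed copy of the markup above:

```python
from bs4 import BeautifulSoup

html = """
<div class="card-body">
    <button class="btn btn-primary btn-lg">Large Primary Button</button>
    <button class="btn btn-secondary">Secondary Button</button>
    <button class="btn btn-primary disabled">Disabled Primary</button>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Primary buttons that do NOT carry the 'disabled' class
enabled_primary = soup.select("button.btn.btn-primary:not(.disabled)")
enabled_texts = [button.get_text() for button in enabled_primary]

print(enabled_texts)  # ['Large Primary Button']
```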

Key Points to Remember

  1. .find_all(class_=["a", "b"]) finds elements with class "a" OR class "b"
  2. .select('.a.b') finds elements with class "a" AND class "b"
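  3. No spaces between class names in CSS selectors (.class1.class2) - a space turns it into a descendant selector
  4. Class order doesn't matter for .select() or a class list - .a.b matches both class="a b" and class="b a" (a single string such as class_="a b" is compared against the whole attribute, so there order does matter)
  5. Case sensitive - class names must match exactly ("Btn" does not match "btn")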
  6. Use custom functions for complex matching logic
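Points 3 and 5 are easy to get wrong, so here is a minimal sketch of both, using made-up markup in which one element carries both classes and another nests them:

```python
from bs4 import BeautifulSoup

html_doc = """
<div class="btn primary">Both classes</div>
<div class="btn"><span class="primary">Nested</span></div>
"""
soup = BeautifulSoup(html_doc, "html.parser")

# No space: a single element carrying both classes
compound = [t.get_text() for t in soup.select(".btn.primary")]

# With a space: a .primary element somewhere inside a .btn element
descendant = [t.get_text() for t in soup.select(".btn .primary")]

# Class names are compared case-sensitively
wrong_case = soup.find_all(class_="BTN")

print(compound)    # ['Both classes']
print(descendant)  # ['Nested']
print(wrong_case)  # []
```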

Together, these approaches cover the common cases: matching elements that have any, all, or exactly a given set of classes.
