To select multiple elements using CSS selectors, you can combine selectors in different ways depending on the elements you want to target. Here are the most common methods:
- Comma-separated list: To select all elements that match any of several selectors, separate the selectors with commas. This is equivalent to the logical OR operator.
/* Selects all <h1> and <h2> elements */
h1, h2 {
  color: blue;
}
- Descendant selector: To select all elements that are descendants of a specified element, use a space between two or more selectors.
/* Selects all <p> elements inside <div> */
div p {
  font-size: 16px;
}
- Child selector: To select all elements that are direct children of a specified element, use the greater-than sign (>) between two selectors.
/* Selects all <p> elements that are direct children of <div> */
div > p {
  margin-left: 20px;
}
- Adjacent sibling selector: To select an element that immediately follows a specific element under the same parent, use the plus sign (+) between two selectors.
/* Selects each <p> element that immediately follows an <h2> */
h2 + p {
  font-weight: bold;
}
- General sibling selector: To select elements that come after a specified element and share its parent, whether or not they are adjacent to it, use the tilde sign (~) between two selectors.
/* Selects all <p> elements that are siblings of <h2> and come after it */
h2 ~ p {
  text-decoration: underline;
}
These selectors work the same way for web scraping, whether in Python with a library such as Beautiful Soup or in JavaScript with Cheerio or the browser's built-in querySelectorAll method, so a single query can target multiple elements in the DOM tree.
Examples in Python with Beautiful Soup:
from bs4 import BeautifulSoup
html = """
<div>
<h1>Title</h1>
<p>Description</p>
<h2>Subtitle</h2>
<p>Detail 1</p>
<p>Detail 2</p>
</div>
"""
# Create a Beautiful Soup object
soup = BeautifulSoup(html, 'html.parser')
# Select all <h1> and <h2> elements
headers = soup.select('h1, h2')
# Select all <p> elements that are descendants of <div>
paragraphs = soup.select('div p')
# Print the text of selected elements
for header in headers:
    print(header.text)
for paragraph in paragraphs:
    print(paragraph.text)
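The example above covers the comma-separated list and the descendant selector. The remaining combinators can be passed to select() in the same way. The following is a minimal sketch, assuming the beautifulsoup4 package is installed; the sample markup and the texts() helper are made up for illustration and are not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this sketch
html = """
<div>
  <h2>Subtitle</h2>
  <p>First</p>
  <section><p>Nested</p></section>
  <p>Second</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def texts(selector):
    """Illustrative helper: text of every element matched by a CSS selector."""
    return [el.get_text() for el in soup.select(selector)]


descendants = texts("div p")   # every <p> anywhere inside <div>
children = texts("div > p")    # only <p> elements directly under <div>
adjacent = texts("h2 + p")     # the <p> that immediately follows <h2>
siblings = texts("h2 ~ p")     # every later <p> sharing <h2>'s parent
either = texts("p, h2")        # comma list: any <p> or <h2>

print(descendants)
print(children)
print(adjacent)
print(siblings)
print(either)
```

The nested paragraph is matched by the descendant selector but not by the child or sibling selectors, because its parent is <section> rather than <div>. The comma-separated query returns matches in document order, so the <h2> comes first even though it is listed last in the selector.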
Examples in JavaScript:
In a browser environment, you can use document.querySelectorAll to select multiple elements.
// Select all <h1> and <h2> elements
const headers = document.querySelectorAll('h1, h2');
// Select all <p> elements that are descendants of <div>
const paragraphs = document.querySelectorAll('div p');
// Log the text of selected elements
headers.forEach(header => {
  console.log(header.textContent);
});
paragraphs.forEach(paragraph => {
  console.log(paragraph.textContent);
});
In Node.js, you would typically use a library like Cheerio:
const cheerio = require('cheerio');
const html = `
<div>
<h1>Title</h1>
<p>Description</p>
<h2>Subtitle</h2>
<p>Detail 1</p>
<p>Detail 2</p>
</div>
`;
const $ = cheerio.load(html);
// Select all <h1> and <h2> elements
const headers = $('h1, h2');
// Select all <p> elements that are descendants of <div>
const paragraphs = $('div p');
// Log the text of selected elements
headers.each(function() {
  console.log($(this).text());
});
paragraphs.each(function() {
  console.log($(this).text());
});
Using these techniques, you can scrape data from a webpage by targeting the necessary elements with CSS selectors.
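As a rough illustration of how this looks in a scraping task, the sketch below selects each repeating container with one query and then runs a narrower query inside it. It assumes the beautifulsoup4 package is installed; the markup, field names, and record layout are hypothetical and would need to be adapted to the page you are actually scraping.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment, invented for this sketch
html = """
<ul>
  <li><h3>Widget</h3><span>9.99</span></li>
  <li><h3>Gadget</h3><span>24.50</span></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

records = []
# Child selector: one match per list item directly under the <ul>
for item in soup.select("ul > li"):
    # select_one() returns the first match inside this item, or None
    name = item.select_one("h3")
    price = item.select_one("span")
    if name is None or price is None:
        continue  # skip items that lack the expected structure
    records.append({
        "name": name.get_text(strip=True),
        "price": float(price.get_text(strip=True)),
    })

print(records)
```

Querying inside each matched item keeps every name paired with its own price, which is harder to guarantee when two separate page-wide queries are zipped together.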