MechanicalSoup is a Python library for automating interaction with websites. It provides a simple API for navigating and manipulating web pages. MechanicalSoup itself does not have built-in proxy support, but because it is built on top of the requests library, you can use requests' proxy-handling capability.
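To see the underlying mechanism, here is a minimal sketch of proxy configuration on a bare requests session, which is exactly what MechanicalSoup reuses. The proxy addresses are placeholders:

```python
import requests

# requests expects a dict mapping URL scheme to proxy URL.
# The addresses below are placeholders.
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}

session = requests.Session()
session.proxies.update(proxies)
print(session.proxies["http"])
```

Every request made through this session is then routed through the matching proxy for its scheme.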
Here's how you can use proxies with MechanicalSoup:
First, install MechanicalSoup if you haven't already:
pip install MechanicalSoup
Now you can configure proxies for your MechanicalSoup session just as you would for a requests session:
import mechanicalsoup
# Define your proxy dictionary
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}

# Create a StatefulBrowser instance
browser = mechanicalsoup.StatefulBrowser()
# Assign the proxies to the session
browser.session.proxies.update(proxies)
# Now you can use the browser object as usual, and it will route through the proxy
browser.open("http://example.com")
In the example above, replace the proxy addresses and ports with the ones you intend to use. The proxies dictionary must follow the format the requests module expects: keys are URL schemes and values are proxy URLs.
Please note that if the proxy requires authentication, you have to include the credentials in the proxy URL:
proxies = {
    "http": "http://user:password@10.10.1.10:3128",
    "https": "http://user:password@10.10.1.10:1080",
}
Always make sure you are compliant with the website's terms of service when using MechanicalSoup with proxies for web scraping. Some websites do not allow web scraping or the use of proxies, and doing so could result in your IP being blocked or other legal consequences.