How do I add authentication credentials to a request with Requests?

When scraping web pages that require authentication, you need to supply credentials with your requests to access the content. The Python Requests library supports several authentication schemes. The sections below show how to attach credentials for each one:

Basic Authentication

For basic HTTP authentication, pass a username and password through the auth parameter of the request function (requests.get, requests.post, and so on):

import requests
from requests.auth import HTTPBasicAuth

url = 'https://example.com/api'
username = 'your_username'
password = 'your_password'

response = requests.get(url, auth=HTTPBasicAuth(username, password))

# Shorthand: a (username, password) tuple is treated as HTTPBasicAuth
response = requests.get(url, auth=(username, password))

print(response.text)
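
To make it concrete what the auth parameter adds to the request, the sketch below rebuilds a Basic Authorization header value using only the standard library. The helper name basic_auth_header is hypothetical, and the Latin-1 encoding is an assumption about how string credentials are encoded; treat it as an illustration, not a replacement for the auth parameter.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of a Basic Authorization header (hypothetical helper)."""
    # Basic auth is the base64 encoding of "username:password".
    raw = f"{username}:{password}".encode("latin1")  # encoding is an assumption
    return "Basic " + base64.b64encode(raw).decode("ascii")


print(basic_auth_header("user", "pass"))  # Basic dXNlcjpwYXNz
```

Because base64 is an encoding and not encryption, Basic credentials are readable by anyone who can observe the request, so they should only be sent over HTTPS.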

Digest Authentication

If the server uses digest authentication, use the HTTPDigestAuth class:

import requests
from requests.auth import HTTPDigestAuth

url = 'https://example.com/api'
username = 'your_username'
password = 'your_password'

response = requests.get(url, auth=HTTPDigestAuth(username, password))

print(response.text)
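
A server normally announces which scheme it expects in the WWW-Authenticate header of a 401 response. The sketch below picks an auth class from such a header value. The helper name pick_auth is hypothetical, and it assumes the header carries a single challenge whose first token is the scheme name.

```python
from requests.auth import HTTPBasicAuth, HTTPDigestAuth


def pick_auth(www_authenticate: str, username: str, password: str):
    """Choose an auth handler from a WWW-Authenticate value (hypothetical helper)."""
    # The scheme name is the first token of the header, e.g. 'Digest realm="..."'.
    scheme = www_authenticate.strip().split(" ", 1)[0].lower()
    if scheme == "digest":
        return HTTPDigestAuth(username, password)
    if scheme == "basic":
        return HTTPBasicAuth(username, password)
    raise ValueError(f"Unsupported authentication scheme: {scheme!r}")


auth = pick_auth('Digest realm="example", nonce="abc123"', "your_username", "your_password")
print(type(auth).__name__)  # HTTPDigestAuth
```

In practice the header value would come from an unauthenticated first request, for example response.headers.get('WWW-Authenticate', '') when response.status_code is 401.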

OAuth

For services that use OAuth 2.0, you first obtain an access token from the provider and then send it as a bearer token in the Authorization header of each request. Here's a basic example:

import requests

url = 'https://example.com/api'
access_token = 'your_access_token'

headers = {
    'Authorization': f'Bearer {access_token}'
}

response = requests.get(url, headers=headers)

print(response.text)

Custom Authentication

If you have a custom authentication scheme, you can define your own authentication class by inheriting from requests.auth.AuthBase and implementing __call__, which receives the outgoing request and must return it:

import requests
from requests.auth import AuthBase

class CustomAuth(AuthBase):
    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        # Modify and return the request
        r.headers['Authorization'] = f'Token {self.token}'
        return r

url = 'https://example.com/api'
token = 'your_custom_token'

response = requests.get(url, auth=CustomAuth(token))

print(response.text)
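
A custom auth class can be checked without contacting a server by preparing the request and inspecting its headers. The sketch below does that for a variant that sends an API key. The class name ApiKeyAuth and the X-API-Key header name are assumptions for illustration; use whatever header your service documents.

```python
import requests
from requests.auth import AuthBase


class ApiKeyAuth(AuthBase):
    """Attach an API key in a custom header (header name is hypothetical)."""

    def __init__(self, api_key, header_name="X-API-Key"):
        self.api_key = api_key
        self.header_name = header_name

    def __call__(self, r):
        # Add the key to the outgoing request and hand the request back.
        r.headers[self.header_name] = self.api_key
        return r


# Prepare the request without sending it, then inspect the headers.
request = requests.Request("GET", "https://example.com/api", auth=ApiKeyAuth("your_api_key"))
prepared = request.prepare()
print(prepared.headers["X-API-Key"])  # your_api_key
```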

Session Objects

To persist settings such as authentication, headers, and cookies across requests, use a Session object. Cookies set by the server are stored and sent back automatically on later requests made through the same session:

import requests

url = 'https://example.com/api'
username = 'your_username'
password = 'your_password'

# Create a session object
session = requests.Session()

# Set up authentication
session.auth = (username, password)

# Make a request
response = session.get(url)

print(response.text)

# The session object will persist the authentication for future requests
another_response = session.get('https://example.com/another_api')
print(another_response.text)
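
A session persists headers as well as the auth attribute, so a bearer token like the one in the OAuth section can be set once and reused. The sketch below prepares a request without sending it to confirm that the header is merged in; the token value and URL are placeholders.

```python
import requests

access_token = "your_access_token"  # placeholder value

# The context manager closes the session's connections on exit.
with requests.Session() as session:
    # Headers set here are merged into every request made through the session.
    session.headers.update({"Authorization": f"Bearer {access_token}"})

    # Prepare (but do not send) a request to confirm the header is applied.
    prepared = session.prepare_request(requests.Request("GET", "https://example.com/api"))
    print(prepared.headers["Authorization"])  # Bearer your_access_token
```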

Handle credentials securely. The hardcoded values in the examples above are placeholders; real scripts should read credentials from environment variables or a secure credential store. Also make sure that you comply with the terms of service of the website you are scraping and with applicable laws and regulations.
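
One way to keep credentials out of the script is to read them from environment variables. This is a minimal sketch; the helper name load_credentials and the variable names API_USERNAME and API_PASSWORD are hypothetical, so pick names that suit your deployment.

```python
import os


def load_credentials(user_var="API_USERNAME", pass_var="API_PASSWORD"):
    """Read a (username, password) tuple from environment variables."""
    username = os.environ.get(user_var)
    password = os.environ.get(pass_var)
    if not username or not password:
        # Fail early with a clear message instead of sending empty credentials.
        raise RuntimeError(f"Set {user_var} and {pass_var} before running this script")
    return username, password
```

The returned tuple can be passed straight to a request, for example requests.get(url, auth=load_credentials()), or assigned to session.auth.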
