What are some common user agents that mimic real browsers for scraping Amazon?

When scraping websites like Amazon, it's essential to mimic a real browser to avoid detection and potential blocking, as web scraping can be against the terms of service of many websites. User agents are one of the key pieces of information that your web scraper sends to a web server to identify the type of device and browser making the request.

Amazon, like many other websites, uses sophisticated bot and scraper detection, so sending a common, up-to-date browser user agent is crucial. Below are some user agents that mimic real browsers (the version numbers shown were current at the time of writing; always check for newer releases):

Common User Agents for Web Scraping

  1. Google Chrome on Windows 10: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36

  2. Mozilla Firefox on Windows 10: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:96.0) Gecko/20100101 Firefox/96.0

  3. Apple Safari on macOS: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Safari/605.1.15

  4. Microsoft Edge on Windows 10: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82 Safari/537.36 Edg/98.0.1108.55

  5. Opera on Windows 10: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36 OPR/82.0.4227.33

  6. Google Chrome on Android: Mozilla/5.0 (Linux; Android 12; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Mobile Safari/537.36

  7. Apple Safari on iOS (iPhone): Mozilla/5.0 (iPhone; CPU iPhone OS 15_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Mobile/15E148 Safari/604.1
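In practice, you rarely hard-code a single string: a common pattern is to keep a pool of user agents like those above and pick one at random per request. The sketch below illustrates this with a shortened pool (the list contents are just examples drawn from the table above):

```python
import random

# A small pool of real-browser user agents (examples from the list above)
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:96.0) Gecko/20100101 Firefox/96.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Safari/605.1.15',
]

def random_user_agent():
    """Return a randomly chosen user agent string from the pool."""
    return random.choice(USER_AGENTS)
```

Each call to `random_user_agent()` returns one of the strings in the pool, which you can then place in a request's `User-Agent` header.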

How to Use a User Agent in Python

When scraping with Python, you can use the requests library and set the User-Agent header to mimic a real browser. Here is an example:

import requests

url = 'https://www.amazon.com'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'
}

# Set a timeout so a stalled connection doesn't hang the script
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # raise an exception for 4xx/5xx responses

# Now you can process the response content

How to Use a User Agent in JavaScript

In JavaScript, if you are using Node.js with a package like axios, you can also set the User-Agent in the request headers:

const axios = require('axios');

const url = 'https://www.amazon.com';

const headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'
};

axios.get(url, { headers })
     .then(response => {
         // Handle the response data
     })
     .catch(error => {
         // Handle the error (e.g. a network failure or a blocked request)
     });

Important Considerations

  • Always use the latest user agents, as using an outdated one can be a red flag to websites.
  • Websites like Amazon often require more sophisticated scraping techniques, such as managing sessions, handling cookies, and potentially rotating user agents and IP addresses to avoid detection.
  • Make sure to comply with Amazon's terms of service and applicable laws regarding scraping. Unauthorized scraping can lead to legal issues and permanent bans from the service.
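The session and cookie handling mentioned above can be sketched in Python with `requests.Session`, which persists cookies and default headers across requests. This is a minimal illustration, not a complete anti-detection setup; the extra headers shown (such as `Accept-Language`) are common additions, not requirements:

```python
import requests

# A Session persists cookies and default headers across requests
session = requests.Session()
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36',
    'Accept-Language': 'en-US,en;q=0.9',  # real browsers send this too
})

# Cookies set by earlier responses are sent automatically on later requests:
# response = session.get('https://www.amazon.com', timeout=10)
```

Every request made through `session` now carries the same browser-like headers, and any cookies the server sets are replayed on subsequent requests without extra code.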

Remember that scraping should be done responsibly and ethically, respecting the target website's terms and conditions as well as legal restrictions.
