How do I scrape data from mobile-responsive websites using Selenium?

Scraping mobile-responsive websites with Selenium means accounting for different screen sizes, touch interactions, and mobile-specific content. Responsive sites often serve different layouts, navigation patterns, and sometimes different content depending on the device viewport and user agent. This guide covers device emulation, mobile UI handling, progressively loaded content, and common pitfalls.

Understanding Mobile-Responsive Challenges

Mobile-responsive websites present unique challenges for web scraping:

  • Dynamic layouts: Content positioning changes based on screen size
  • Hidden elements: Desktop navigation may be replaced with hamburger menus
  • Touch interactions: Some elements only respond to touch events
  • Progressive loading: Content may load differently on mobile devices
  • Media queries: CSS behavior varies based on viewport dimensions
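
One way to check which layout a page has actually applied is to ask the browser to evaluate a media query. The sketch below assumes a Selenium-style driver object that exposes execute_script; the helper names and the 768px breakpoint are illustrative assumptions, not values taken from any particular site.

```python
def matches_media_query(driver, query):
    """Return True if the page currently matches the given CSS media query."""
    # window.matchMedia is a standard browser API; the query is passed as a
    # script argument instead of being interpolated into the JavaScript.
    return bool(driver.execute_script(
        "return window.matchMedia(arguments[0]).matches;", query
    ))


def is_mobile_layout(driver, breakpoint_px=768):
    """Guess whether the mobile layout is active (the breakpoint is an assumption)."""
    return matches_media_query(driver, f"(max-width: {breakpoint_px - 1}px)")
```

Calling is_mobile_layout(driver) after driver.get(...) gives a quick sanity check that emulation took effect before any selectors are evaluated.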

Setting Up Mobile Device Emulation

Chrome Mobile Emulation

The most direct approach is to use ChromeDriver's mobileEmulation option, which sets the viewport size, device pixel ratio, and user agent together:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def setup_mobile_chrome_driver():
    chrome_options = Options()

    # Enable mobile emulation
    mobile_emulation = {
        "deviceMetrics": {
            "width": 375,
            "height": 667,
            "pixelRatio": 2.0
        },
        "userAgent": "Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1"
    }

    chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")

    driver = webdriver.Chrome(options=chrome_options)
    return driver

# Usage
driver = setup_mobile_chrome_driver()
driver.get("https://example.com")

Predefined Device Emulation

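Chrome also supports predefined device profiles. The deviceName must match an entry in the DevTools device list of the installed Chrome version, so names such as "iPhone 12 Pro" or "Pixel 5" may not be available in every release: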

def setup_iphone_emulation():
    chrome_options = Options()

    # Use predefined device
    mobile_emulation = {"deviceName": "iPhone 12 Pro"}
    chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)

    driver = webdriver.Chrome(options=chrome_options)
    return driver

def setup_android_emulation():
    chrome_options = Options()

    # Android device emulation
    mobile_emulation = {"deviceName": "Pixel 5"}
    chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)

    driver = webdriver.Chrome(options=chrome_options)
    return driver
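
The two configuration styles above (explicit metrics and a predefined device name) can be produced by one small helper. This is a minimal sketch: build_mobile_emulation is a hypothetical name, and treating the two forms as mutually exclusive is an assumption made here for clarity.

```python
def build_mobile_emulation(device_name=None, width=None, height=None,
                           pixel_ratio=2.0, user_agent=None):
    """Build a mobileEmulation dict from a device name or explicit metrics."""
    if device_name is not None:
        if width is not None or height is not None:
            raise ValueError("Pass either device_name or width/height, not both")
        return {"deviceName": device_name}

    if width is None or height is None:
        raise ValueError("width and height are required without device_name")

    emulation = {
        "deviceMetrics": {
            "width": width,
            "height": height,
            "pixelRatio": pixel_ratio,
        }
    }
    # The user agent is optional; omit the key entirely when it is not given
    if user_agent:
        emulation["userAgent"] = user_agent
    return emulation
```

The returned dict can be passed to chrome_options.add_experimental_option("mobileEmulation", ...) exactly as in the functions above.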

JavaScript Implementation

For Node.js applications using Selenium WebDriver:

const { Builder, By } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

async function setupMobileChrome() {
    const mobileEmulation = {
        deviceMetrics: {
            width: 375,
            height: 667,
            pixelRatio: 2.0
        },
        userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1'
    };

    const options = new chrome.Options();
    options.setMobileEmulation(mobileEmulation);
    options.addArguments('--no-sandbox');
    options.addArguments('--disable-dev-shm-usage');

    const driver = await new Builder()
        .forBrowser('chrome')
        .setChromeOptions(options)
        .build();

    return driver;
}

// Usage
async function scrapeWithMobileEmulation() {
    const driver = await setupMobileChrome();

    try {
        await driver.get('https://example.com');
        // Perform scraping operations
        const elements = await driver.findElements(By.css('.mobile-specific-class'));

        for (let element of elements) {
            const text = await element.getText();
            console.log(text);
        }
    } finally {
        await driver.quit();
    }
}

Handling Mobile-Specific UI Elements

Managing Hamburger Menus

Mobile sites often use hamburger menus instead of traditional navigation:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def handle_mobile_navigation(driver):
    try:
        # Wait for hamburger menu to be clickable
        hamburger_menu = WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, ".hamburger-menu, .menu-toggle, .navbar-toggler"))
        )

        # Click to open menu
        hamburger_menu.click()

        # Wait for menu items to be visible
        WebDriverWait(driver, 5).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, ".mobile-menu, .nav-menu"))
        )

        # Extract menu items
        menu_items = driver.find_elements(By.CSS_SELECTOR, ".mobile-menu a, .nav-menu a")

        for item in menu_items:
            print(f"Menu item: {item.text} - URL: {item.get_attribute('href')}")

    except Exception as e:
        print(f"Error handling mobile navigation: {e}")
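
Responsive pages often keep both the desktop and the mobile navigation in the DOM, so the extracted links may contain duplicates and toggle anchors. The following sketch cleans a list of (text, href) pairs using only the standard library; normalize_menu_links is a hypothetical helper, and the rule for skipping "#" and "javascript:" links is an assumption about typical menus.

```python
from urllib.parse import urljoin, urldefrag


def normalize_menu_links(base_url, raw_items):
    """Resolve, de-duplicate, and clean (text, href) pairs taken from a menu."""
    seen = set()
    links = []
    for text, href in raw_items:
        # Skip empty hrefs, menu toggles, and in-page anchors
        if not href or href.startswith(("javascript:", "#")):
            continue
        # Resolve relative URLs and drop the fragment before comparing
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url in seen:
            continue
        seen.add(url)
        links.append({"text": (text or "").strip(), "url": url})
    return links
```

In handle_mobile_navigation, the input could be built as [(item.text, item.get_attribute("href")) for item in menu_items].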

Touch Interactions

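Some mobile elements listen for touch events rather than click events. Try a regular click first and fall back to synthetic touch events only if the page does not react:

from selenium.webdriver.common.action_chains import ActionChains

def perform_touch_interaction(driver, element, use_js_touch=False):
    if not use_js_touch:
        # Default: a regular click, which is enough for most elements
        ActionChains(driver).move_to_element(element).click().perform()
        return

    # Alternative: dispatch synthetic touchstart/touchend events.
    # These events are untrusted (isTrusted is false), so some sites ignore them.
    driver.execute_script("""
        var element = arguments[0];
        var rect = element.getBoundingClientRect();
        var touch = new Touch({
            identifier: 0,
            target: element,
            // clientX/clientY are viewport coordinates: use the element center
            clientX: rect.left + rect.width / 2,
            clientY: rect.top + rect.height / 2
        });
        element.dispatchEvent(new TouchEvent('touchstart', {
            bubbles: true, cancelable: true,
            touches: [touch], targetTouches: [touch], changedTouches: [touch]
        }));
        element.dispatchEvent(new TouchEvent('touchend', {
            bubbles: true, cancelable: true,
            touches: [], targetTouches: [], changedTouches: [touch]
        }));
    """, element)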

Responsive Viewport Testing

Test multiple viewport sizes to ensure comprehensive data collection:

def scrape_multiple_viewports(url):
    viewports = [
        {"width": 320, "height": 568, "name": "iPhone 5"},
        {"width": 375, "height": 667, "name": "iPhone 6/7/8"},
        {"width": 414, "height": 896, "name": "iPhone 11"},
        {"width": 360, "height": 640, "name": "Android Small"},
        {"width": 768, "height": 1024, "name": "Tablet"}
    ]

    results = {}

    for viewport in viewports:
        chrome_options = Options()
        mobile_emulation = {
            "deviceMetrics": {
                "width": viewport["width"],
                "height": viewport["height"],
                "pixelRatio": 2.0
            },
            "userAgent": "Mozilla/5.0 (Linux; Android 10; SM-A205U) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.120 Mobile Safari/537.36"
        }

        chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)
        driver = webdriver.Chrome(options=chrome_options)

        try:
            driver.get(url)

            # Wait for content to load
            WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.TAG_NAME, "body"))
            )

            # Extract data specific to this viewport
            elements = driver.find_elements(By.CSS_SELECTOR, ".content-item")
            viewport_data = [elem.text for elem in elements]

            results[viewport["name"]] = viewport_data

        except Exception as e:
            print(f"Error scraping {viewport['name']}: {e}")
            results[viewport["name"]] = []
        finally:
            driver.quit()

    return results
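
Once results are collected per viewport, it helps to see which items are missing at each size. The sketch below takes the dictionary returned by scrape_multiple_viewports and compares every viewport against the combined set of items; summarize_viewport_results is a hypothetical helper name, and comparing items by their exact text is an assumption.

```python
def summarize_viewport_results(results):
    """Report, per viewport, the item count and which known items are missing."""
    # Union of every item seen at any viewport
    all_items = set()
    for items in results.values():
        all_items.update(items)

    summary = {}
    for name, items in results.items():
        summary[name] = {
            "count": len(items),
            "missing": sorted(all_items - set(items)),
        }
    return summary
```

A non-empty "missing" list for a viewport suggests the site hides or drops that content at that size, which is a signal to scrape it from a wider viewport instead.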

Handling Progressive Loading

Mobile sites often use progressive loading techniques that require special handling:

from selenium.common.exceptions import TimeoutException

def handle_progressive_loading(driver, max_scrolls=50):
    # Scroll to trigger lazy loading
    last_height = driver.execute_script("return document.body.scrollHeight")

    for _ in range(max_scrolls):  # cap the loop so endless feeds terminate
        # Scroll to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        try:
            # Wait for the page to grow, i.e. for new content to load
            WebDriverWait(driver, 5).until(
                lambda d: d.execute_script("return document.body.scrollHeight") > last_height
            )
        except TimeoutException:
            # Height did not change within 5 seconds: assume everything is loaded
            break

        last_height = driver.execute_script("return document.body.scrollHeight")

    # Extract all loaded content
    all_items = driver.find_elements(By.CSS_SELECTOR, ".item, .post, .product")
    return [item.text for item in all_items]
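
Some mobile feeds remove off-screen items from the DOM as you scroll, so reading everything once at the end can miss content. A possible variation is to collect items after each scroll step and de-duplicate them. This is a sketch under stated assumptions: collect_while_scrolling and its read_items callback are hypothetical, items are identified by their text, and a round with no new items is taken to mean the end of the feed.

```python
import time


def collect_while_scrolling(driver, read_items, max_scrolls=20,
                            pause=1.0, sleep=time.sleep):
    """Scroll one screen at a time, accumulating unique item texts."""
    seen = {}  # a dict keeps first-seen order and de-duplicates
    for _ in range(max_scrolls):
        before = len(seen)
        for text in read_items(driver):
            seen.setdefault(text, True)
        if len(seen) == before:
            break  # a full round produced nothing new: assume the end
        driver.execute_script("window.scrollBy(0, window.innerHeight);")
        sleep(pause)  # crude settle time; an explicit wait could replace this
    return list(seen)
```

Here read_items could be something like lambda d: [e.text for e in d.find_elements(By.CSS_SELECTOR, ".item")]. On slow connections a single empty round may be a false stop, so the pause may need tuning.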

Advanced Mobile Scraping Techniques

Network Throttling

Simulate mobile network conditions:

def setup_network_throttling(driver):
    # Enable network throttling to simulate mobile connections
    driver.execute_cdp_cmd('Network.emulateNetworkConditions', {
        'offline': False,
        'downloadThroughput': 1.5 * 1024 * 1024 / 8,  # 1.5 Mbps
        'uploadThroughput': 750 * 1024 / 8,  # 750 Kbps
        'latency': 40  # 40ms latency
    })
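
The unit conversion in the call above is easy to get wrong, because the DevTools protocol expects throughput in bytes per second. A small helper can make the units explicit. This is a sketch: network_conditions and apply_network_conditions are hypothetical names, and the driver is assumed to be a Chromium-based Selenium driver exposing execute_cdp_cmd.

```python
def network_conditions(download_mbps, upload_kbps, latency_ms, offline=False):
    """Build parameters for the CDP Network.emulateNetworkConditions command."""
    return {
        "offline": offline,
        # Convert megabits/kilobits per second to bytes per second
        "downloadThroughput": download_mbps * 1024 * 1024 / 8,
        "uploadThroughput": upload_kbps * 1024 / 8,
        "latency": latency_ms,  # minimum added latency in milliseconds
    }


def apply_network_conditions(driver, params):
    """Send the throttling parameters to the browser."""
    driver.execute_cdp_cmd("Network.emulateNetworkConditions", params)
```

For example, apply_network_conditions(driver, network_conditions(1.5, 750, 40)) reproduces the settings used in setup_network_throttling.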

Handling Swipe Gestures

For carousel or swipeable content:

def simulate_swipe(driver, element, direction="left"):
    # Horizontal reach: 30% of the element width on either side of its center
    offset = round(element.size['width'] * 0.3)

    # Swiping left drags from the right side to the left side, and vice versa
    if direction == "left":
        start_offset, travel = offset, -2 * offset
    else:
        start_offset, travel = -offset, 2 * offset

    # Perform swipe action. In Selenium 4.3 and later, the offsets passed to
    # move_to_element_with_offset are measured from the element's center.
    actions = ActionChains(driver)
    actions.move_to_element_with_offset(element, start_offset, 0)
    actions.click_and_hold()
    actions.move_by_offset(travel, 0)
    actions.release()
    actions.perform()
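
The swipe geometry can be separated from the browser calls so that it covers vertical swipes and can be checked without a driver. The sketch below is one possible arrangement: swipe_vector is a hypothetical helper, start offsets are assumed to be relative to the element's center, and the default reach of 30% is an arbitrary choice.

```python
def swipe_vector(width, height, direction, reach=0.3):
    """Return ((start_dx, start_dy), (move_dx, move_dy)) for a swipe.

    Start offsets are relative to the element's center; reach is the
    fraction of the element size covered on each side of the center.
    """
    dx, dy = round(width * reach), round(height * reach)
    vectors = {
        "left": ((dx, 0), (-2 * dx, 0)),    # drag from right to left
        "right": ((-dx, 0), (2 * dx, 0)),   # drag from left to right
        "up": ((0, dy), (0, -2 * dy)),      # drag from bottom to top
        "down": ((0, -dy), (0, 2 * dy)),    # drag from top to bottom
    }
    if direction not in vectors:
        raise ValueError(f"Unknown direction: {direction!r}")
    return vectors[direction]
```

The first tuple would go to move_to_element_with_offset and the second to move_by_offset, using element.size for the width and height.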

Best Practices for Mobile Scraping

1. Responsive Design Testing

Always test your scraping logic across multiple screen sizes, as mobile-responsive sites may show different content based on viewport dimensions. Similar to how you might handle browser sessions in Puppeteer, maintaining consistent session state across different mobile viewports is crucial.

2. Wait Strategies

Mobile sites often have longer loading times and progressive content loading:

import time

def wait_for_mobile_content(driver, timeout=15):
    # Wait for initial content
    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "main, .content, #main"))
    )

    # Wait for images to load
    WebDriverWait(driver, timeout).until(
        lambda d: d.execute_script("""
            return Array.from(document.images).every(img => img.complete);
        """)
    )

    # Wait for any lazy-loaded content
    time.sleep(2)

3. Error Handling

Implement robust error handling for mobile-specific issues:

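import time
from urllib.parse import urlparse

def robust_mobile_scraping(driver, url):
    max_retries = 3

    for attempt in range(max_retries):
        try:
            driver.get(url)

            # Detect redirects to a dedicated mobile host such as m.example.com.
            # Checking the hostname avoids false matches on paths or on hosts
            # like forum.example.com.
            host = urlparse(driver.current_url).hostname or ""
            if host.startswith(("m.", "mobile.")):
                print(f"Mobile redirect detected: {driver.current_url}")

            # Wait for content to be ready
            wait_for_mobile_content(driver)

            # Extract data (extract_mobile_data stands in for your own extraction logic)
            data = extract_mobile_data(driver)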
            return data

        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                raise
            time.sleep(2)
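
The fixed two-second delay between attempts can be replaced with exponential backoff, which gives a slow mobile page progressively more time to recover. This is a minimal, generic sketch: retry_with_backoff is a hypothetical helper, and the injectable sleep parameter exists only to make the function easy to test.

```python
import time


def retry_with_backoff(action, max_retries=3, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call action(), retrying with exponentially growing delays."""
    for attempt in range(max_retries):
        try:
            return action()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles on each failed attempt: base, 2x base, 4x base, ...
            sleep(base_delay * (2 ** attempt))
```

For example, retry_with_backoff(lambda: robust_page_load(driver, url)) would wrap a single load-and-extract step, where robust_page_load is whatever function performs that step in your code.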

Performance Optimization

Resource Management

Mobile emulation can be resource-intensive. Optimize performance:

def optimize_mobile_driver(disable_javascript=False):
    chrome_options = Options()

    # Mobile emulation
    mobile_emulation = {"deviceName": "iPhone 12 Pro"}
    chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)

    # Skip image downloads. Chrome has no --disable-images switch,
    # so use a Blink setting instead.
    chrome_options.add_argument("--blink-settings=imagesEnabled=false")

    # Disable JavaScript through a content-settings preference.
    # Only do this if the target pages render their content without JS.
    if disable_javascript:
        chrome_options.add_experimental_option("prefs", {
            "profile.managed_default_content_settings.javascript": 2
        })

    chrome_options.add_argument("--disable-extensions")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")

    # Raise the V8 heap limit for script-heavy pages. V8 options must be
    # passed through --js-flags; this allows more memory rather than saving it.
    chrome_options.add_argument("--js-flags=--max-old-space-size=4096")

    return webdriver.Chrome(options=chrome_options)

Common Pitfalls and Solutions

Viewport Detection Issues

Some sites use JavaScript to detect viewport size. Ensure proper timing:

import time

def ensure_viewport_detection(driver):
    # Trigger resize event to ensure proper viewport detection
    driver.execute_script("""
        window.dispatchEvent(new Event('resize'));
        window.dispatchEvent(new Event('orientationchange'));
    """)

    # Wait for layout to settle
    time.sleep(1)

Content Differences

Mobile sites may show different content. Compare desktop vs mobile results:

def get_content_items(driver, url):
    # Load the page and collect the text of every content item
    driver.get(url)
    elements = driver.find_elements(By.CSS_SELECTOR, ".content-item")
    return [elem.text for elem in elements]

def compare_desktop_mobile_content(url):
    # Desktop scraping; quit the driver even if the page fails to load
    desktop_driver = webdriver.Chrome()
    try:
        desktop_data = get_content_items(desktop_driver, url)
    finally:
        desktop_driver.quit()

    # Mobile scraping
    mobile_driver = setup_mobile_chrome_driver()
    try:
        mobile_data = get_content_items(mobile_driver, url)
    finally:
        mobile_driver.quit()

    # Compare results
    print(f"Desktop items: {len(desktop_data)}")
    print(f"Mobile items: {len(mobile_data)}")

    return {"desktop": desktop_data, "mobile": mobile_data}

Testing Mobile-Specific Features

Orientation Changes

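Calls to screen.orientation.lock() usually fail outside fullscreen mode on a real mobile browser, so under emulation it is more reliable to swap the emulated screen dimensions through the DevTools protocol:

def test_orientation_changes(driver, width=375, height=667):
    def set_orientation(w, h, orientation_type, angle):
        # Override the emulated screen so media queries see the new orientation
        driver.execute_cdp_cmd("Emulation.setDeviceMetricsOverride", {
            "width": w,
            "height": h,
            "deviceScaleFactor": 2,
            "mobile": True,
            "screenOrientation": {"type": orientation_type, "angle": angle}
        })

    # Portrait mode (default)
    set_orientation(width, height, "portraitPrimary", 0)

    # Extract portrait data (extract_data stands in for your own extraction logic)
    portrait_data = extract_data(driver)

    # Landscape mode: swap width and height
    set_orientation(height, width, "landscapePrimary", 90)

    # Extract landscape data
    landscape_data = extract_data(driver)

    return {"portrait": portrait_data, "landscape": landscape_data}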

Touch Events Simulation

Simulate complex touch interactions:

def simulate_pinch_zoom(driver, element, scale_factor=1.5):
    # Simulate a pinch gesture with synthetic touch events. These events are
    # untrusted, so they only affect pages that implement their own gesture
    # handlers; they do not trigger the browser's native zoom.
    driver.execute_script("""
        var element = arguments[0];
        var scale = arguments[1];
        var rect = element.getBoundingClientRect();
        var cx = rect.left + rect.width / 2;
        var cy = rect.top + rect.height / 2;

        // Build two touch points at a given distance either side of the center
        function touchPair(spread) {
            return [-1, 1].map(function (sign, i) {
                return new Touch({
                    identifier: i,
                    target: element,
                    clientX: cx + sign * spread,
                    clientY: cy
                });
            });
        }

        function fire(type, active, changed) {
            element.dispatchEvent(new TouchEvent(type, {
                bubbles: true, cancelable: true,
                touches: active, targetTouches: active, changedTouches: changed
            }));
        }

        var spread = rect.width * 0.2;
        var start = touchPair(spread);
        var end = touchPair(spread * scale);

        fire('touchstart', start, start);
        // Fingers move apart when scale > 1 and together when scale < 1
        fire('touchmove', end, end);
        fire('touchend', [], end);
    """, element, scale_factor)

Conclusion

Scraping mobile-responsive websites with Selenium requires careful consideration of viewport settings, mobile-specific UI patterns, and progressive loading behaviors. By implementing proper device emulation, handling mobile navigation patterns, and using appropriate wait strategies, you can effectively extract data from responsive websites across different screen sizes.

Remember to test your scraping logic across multiple viewport sizes and device types to ensure comprehensive data collection. Just as you would handle AJAX requests using Puppeteer, managing asynchronous content loading on mobile devices requires patience and robust error handling.

The key to successful mobile scraping lies in understanding how responsive design affects content presentation and adapting your scraping strategy accordingly. With the techniques outlined in this guide, you'll be well-equipped to handle the unique challenges of mobile-responsive web scraping.
