How to Handle Geolocation Settings in Puppeteer?

Geolocation handling is crucial when scraping location-aware websites or testing applications that rely on user location data. Puppeteer provides comprehensive geolocation control through browser context permissions and coordinate overrides, allowing developers to simulate users from different geographical locations.

Understanding Geolocation in Puppeteer

Puppeteer allows you to control geolocation in two primary ways:

  1. Permission Management: Grant or deny geolocation permissions
  2. Coordinate Override: Set specific latitude and longitude coordinates

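This functionality is useful for testing location-based features and scraping region-specific content. It only changes what the browser's Geolocation API reports, so sites that infer location from the IP address still see the machine's real network location; getting past IP-based geo-restrictions also requires a proxy in the target region.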
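The two controls can be wrapped in one helper. The sketch below is a minimal illustration rather than part of Puppeteer: `applyGeolocation` is a hypothetical name, the default accuracy of 100 is an assumption, and `page` is assumed to be a Puppeteer `Page` exposing `browserContext()` and `setGeolocation()`.

```javascript
// Hypothetical helper: grants the geolocation permission for one origin and
// overrides the coordinates this page reports.
async function applyGeolocation(page, origin, { latitude, longitude, accuracy = 100 }) {
  // Permission management: allow the origin to read the location.
  await page.browserContext().overridePermissions(origin, ['geolocation']);
  // Coordinate override: the values navigator.geolocation will report.
  await page.setGeolocation({ latitude, longitude, accuracy });
}
```

Call it before `page.goto()`, for example `await applyGeolocation(page, 'https://example.com', { latitude: 40.7128, longitude: -74.0060 })`.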

Basic Geolocation Setup

Granting Geolocation Permissions

First, you need to grant geolocation permissions to the page:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();

  // Grant geolocation permissions
  const context = browser.defaultBrowserContext();
  await context.overridePermissions('https://example.com', ['geolocation']);

  await page.goto('https://example.com');
  await browser.close();
})();

Setting Specific Coordinates

Override the browser's geolocation with custom coordinates:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set geolocation to New York City
  await page.setGeolocation({
    latitude: 40.7128,
    longitude: -74.0060,
    accuracy: 100
  });

  // Grant permissions
  const context = browser.defaultBrowserContext();
  await context.overridePermissions('https://maps.google.com', ['geolocation']);

  await page.goto('https://maps.google.com');
  await browser.close();
})();

Advanced Geolocation Scenarios

Testing Multiple Locations

When scraping location-based content, you might need to test multiple geographical positions:

const puppeteer = require('puppeteer');

const locations = [
  { name: 'New York', lat: 40.7128, lng: -74.0060 },
  { name: 'London', lat: 51.5074, lng: -0.1278 },
  { name: 'Tokyo', lat: 35.6762, lng: 139.6503 },
  { name: 'Sydney', lat: -33.8688, lng: 151.2093 }
];

(async () => {
  const browser = await puppeteer.launch();

  for (const location of locations) {
    const page = await browser.newPage();

    // Set geolocation for current location
    await page.setGeolocation({
      latitude: location.lat,
      longitude: location.lng,
      accuracy: 100
    });

    // Grant permissions
    const context = browser.defaultBrowserContext();
    await context.overridePermissions('https://weather.com', ['geolocation']);

    await page.goto('https://weather.com');

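    // Wait for location detection (page.waitForTimeout() is no longer
    // available in recent Puppeteer releases, so use a plain timer)
    await new Promise((resolve) => setTimeout(resolve, 3000));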

    // Extract location-specific data
    const locationData = await page.evaluate(() => {
      return {
        currentLocation: document.querySelector('.current-location')?.textContent,
        temperature: document.querySelector('.temperature')?.textContent
      };
    });

    console.log(`${location.name}:`, locationData);
    await page.close();
  }

  await browser.close();
})();
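In a loop like this, one failed navigation rejects the whole run and skips the remaining cities. A small wrapper can isolate failures per location. This is only a sketch: `runForEachLocation` and its result shape are hypothetical, and `task` stands in for the per-page work done inside the loop.

```javascript
// Hypothetical helper: runs an async task once per location, in order, and
// records a failure for one location without aborting the others.
async function runForEachLocation(locations, task) {
  const results = [];
  for (const location of locations) {
    try {
      results.push({ name: location.name, ok: true, data: await task(location) });
    } catch (error) {
      results.push({ name: location.name, ok: false, error: String(error) });
    }
  }
  return results;
}
```

The `task` callback would open a page, set the geolocation, scrape, and close the page, so a timeout on one site still leaves results for the others.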

Handling Geolocation Errors

Implement error handling for geolocation scenarios:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  try {
    // Set geolocation
    await page.setGeolocation({
      latitude: 37.7749,
      longitude: -122.4194,
      accuracy: 100
    });

    // Grant permissions
    const context = browser.defaultBrowserContext();
    await context.overridePermissions('https://example.com', ['geolocation']);

    await page.goto('https://example.com');

    // Confirm the Geolocation API is exposed (it requires a secure context
    // such as HTTPS); this does not wait for a position fix
    await page.waitForFunction(() => {
      return navigator.geolocation !== undefined;
    });

    // Test geolocation functionality
    const position = await page.evaluate(() => {
      return new Promise((resolve, reject) => {
        navigator.geolocation.getCurrentPosition(
          (position) => {
            resolve({
              latitude: position.coords.latitude,
              longitude: position.coords.longitude,
              accuracy: position.coords.accuracy
            });
          },
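          (error) => {
            // Reject with an Error that carries the numeric code as well
            reject(new Error(`Geolocation failed (code ${error.code}): ${error.message}`));
          }
        );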
      });
    });

    console.log('Geolocation detected:', position);

  } catch (error) {
    console.error('Geolocation error:', error);
  }

  await browser.close();
})();
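The error callback receives a `GeolocationPositionError` whose `code` is 1, 2, or 3. A lookup on the Node side makes logs easier to read. The sketch below assumes you pass `error.code` out of `page.evaluate()`; `describeGeolocationError` is a hypothetical helper name.

```javascript
// Codes defined by the Geolocation API's GeolocationPositionError interface.
const GEOLOCATION_ERROR_NAMES = {
  1: 'PERMISSION_DENIED',
  2: 'POSITION_UNAVAILABLE',
  3: 'TIMEOUT',
};

// Hypothetical helper: turns a numeric error code into a readable label.
function describeGeolocationError(code) {
  return GEOLOCATION_ERROR_NAMES[code] ?? `UNKNOWN (${code})`;
}
```

A `PERMISSION_DENIED` result usually means the permission was not granted for the origin the page actually loaded from.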

Working with Geolocation APIs

Simulating Location Change

Simulate dynamic location changes during browsing:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();

  // Initial location: San Francisco
  await page.setGeolocation({
    latitude: 37.7749,
    longitude: -122.4194,
    accuracy: 100
  });

  const context = browser.defaultBrowserContext();
  await context.overridePermissions('https://maps.google.com', ['geolocation']);

  await page.goto('https://maps.google.com');
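  // page.waitForTimeout() is no longer available in recent Puppeteer releases; use a plain timer
  await new Promise((resolve) => setTimeout(resolve, 3000));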

  // Change location to Los Angeles
  await page.setGeolocation({
    latitude: 34.0522,
    longitude: -118.2437,
    accuracy: 100
  });

  // Trigger location refresh
  await page.evaluate(() => {
    navigator.geolocation.getCurrentPosition(() => {
      window.location.reload();
    });
  });

  await new Promise((resolve) => setTimeout(resolve, 3000));
  await browser.close();
})();
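To simulate gradual movement rather than a single jump, you can generate intermediate points and apply them one at a time. The following is a rough sketch: `interpolateRoute` is a hypothetical helper, and it interpolates linearly in degrees, which is only a reasonable approximation over short distances that do not cross the antimeridian.

```javascript
// Hypothetical helper: returns `steps + 1` evenly spaced points from start to
// end, both inclusive. Linear interpolation in degrees is an approximation
// and does not follow a great-circle path.
function interpolateRoute(start, end, steps) {
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError('steps must be a positive integer');
  }
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    points.push({
      latitude: start.latitude + (end.latitude - start.latitude) * t,
      longitude: start.longitude + (end.longitude - start.longitude) * t,
    });
  }
  return points;
}
```

Each point can then be passed to `page.setGeolocation({ ...point, accuracy: 100 })` with a short pause between calls; pages that use `watchPosition()` may pick up each update.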

Location-Based Content Scraping

Extract content that varies by geographical location:

const puppeteer = require('puppeteer');

async function scrapeLocationContent(lat, lng, url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set geolocation
  await page.setGeolocation({
    latitude: lat,
    longitude: lng,
    accuracy: 100
  });

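  // Grant permissions (overridePermissions is documented as taking an
  // origin, so strip any path or query string from the URL)
  const context = browser.defaultBrowserContext();
  await context.overridePermissions(new URL(url).origin, ['geolocation']);

  await page.goto(url);
  // page.waitForTimeout() is no longer available in recent Puppeteer releases
  await new Promise((resolve) => setTimeout(resolve, 2000));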

  // Extract location-specific content
  const content = await page.evaluate(() => {
    return {
      title: document.title,
      locationInfo: document.querySelector('.location-info')?.textContent,
      prices: Array.from(document.querySelectorAll('.price')).map(el => el.textContent),
      availability: document.querySelector('.availability')?.textContent
    };
  });

  await browser.close();
  return content;
}

// Usage
(async () => {
  const newYorkContent = await scrapeLocationContent(40.7128, -74.0060, 'https://example-store.com');
  const londonContent = await scrapeLocationContent(51.5074, -0.1278, 'https://example-store.com');

  console.log('New York Content:', newYorkContent);
  console.log('London Content:', londonContent);
})();

Best Practices for Geolocation Handling

1. Always Grant Permissions First

// Grant geolocation permissions and set coordinates before calling page.goto()
const context = browser.defaultBrowserContext();
await context.overridePermissions(url, ['geolocation']);
await page.setGeolocation({ latitude: lat, longitude: lng, accuracy: 100 });

2. Use Realistic Accuracy Values

// Use realistic accuracy values (10-100 meters for GPS)
await page.setGeolocation({
  latitude: 40.7128,
  longitude: -74.0060,
  accuracy: 50 // 50 meters accuracy
});

3. Handle Timing Issues

// Confirm the Geolocation API is available before querying it
await page.waitForFunction(() => {
  return typeof navigator.geolocation !== 'undefined';
});

4. Test Permission Scenarios

// Test both granted and denied permissions
await context.overridePermissions(url, []); // Deny all permissions
await context.overridePermissions(url, ['geolocation']); // Grant geolocation
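When a test suite switches between granted and denied states, leftover overrides can leak into the next test. One way to contain them is a wrapper that resets the context afterwards. This is a sketch under the assumption that `context` is a Puppeteer `BrowserContext`; `withPermissions` is a hypothetical name.

```javascript
// Hypothetical helper: applies a permission set for one origin, runs the
// callback, then clears every override on the context, even if the callback throws.
async function withPermissions(context, origin, permissions, fn) {
  await context.overridePermissions(origin, permissions);
  try {
    return await fn();
  } finally {
    await context.clearPermissionOverrides();
  }
}
```

Note that `clearPermissionOverrides()` resets all overrides on the context, not only those for the given origin.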

Common Geolocation Challenges

Permission Handling

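The browser's native permission prompt is not delivered as a `dialog` event; it is controlled with `overridePermissions()`. The `dialog` event fires only for JavaScript dialogs (`alert`, `confirm`, `prompt`, `beforeunload`), so a handler like the one below helps only when a site shows its own dialog before requesting the location:

// Accept site-generated JavaScript dialogs that mention location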
page.on('dialog', async (dialog) => {
  if (dialog.message().includes('location')) {
    await dialog.accept();
  }
});

Coordinate Validation

Validate coordinates before setting them:

function validateCoordinates(lat, lng) {
  return lat >= -90 && lat <= 90 && lng >= -180 && lng <= 180;
}

if (validateCoordinates(latitude, longitude)) {
  await page.setGeolocation({ latitude, longitude, accuracy: 100 });
}
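A boolean range check accepts numeric strings such as `'40.7'`, because JavaScript coerces them during comparison. A stricter variant can reject non-numbers and say which value was wrong. This is a sketch; `assertValidCoordinates` is a hypothetical name.

```javascript
// Hypothetical stricter variant: rejects non-numbers, NaN and Infinity, and
// reports which value is out of range.
function assertValidCoordinates(latitude, longitude) {
  if (!Number.isFinite(latitude) || latitude < -90 || latitude > 90) {
    throw new RangeError(`latitude must be a number in [-90, 90], got ${latitude}`);
  }
  if (!Number.isFinite(longitude) || longitude < -180 || longitude > 180) {
    throw new RangeError(`longitude must be a number in [-180, 180], got ${longitude}`);
  }
  return { latitude, longitude };
}
```

Because it returns the validated pair, the result can be spread straight into the options: `await page.setGeolocation({ ...assertValidCoordinates(lat, lng), accuracy: 100 })`.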

Integration with Testing Frameworks

Jest Integration

const puppeteer = require('puppeteer');

describe('Geolocation Tests', () => {
  let browser, page;

  beforeEach(async () => {
    browser = await puppeteer.launch();
    page = await browser.newPage();
  });

  afterEach(async () => {
    await browser.close();
  });

  test('should detect New York location', async () => {
    await page.setGeolocation({
      latitude: 40.7128,
      longitude: -74.0060,
      accuracy: 100
    });

    const context = browser.defaultBrowserContext();
    await context.overridePermissions('https://example.com', ['geolocation']);

    await page.goto('https://example.com');

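    const location = await page.evaluate(() => {
      return new Promise((resolve, reject) => {
        navigator.geolocation.getCurrentPosition(
          (position) => {
            resolve({
              lat: position.coords.latitude,
              lng: position.coords.longitude
            });
          },
          // Fail fast instead of letting the test hang until Jest times out
          (error) => reject(new Error(error.message))
        );
      });
    });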

    expect(location.lat).toBeCloseTo(40.7128, 4);
    expect(location.lng).toBeCloseTo(-74.0060, 4);
  });
});
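Comparing raw degrees with `toBeCloseTo(value, 4)` allows a difference of up to 0.00005 degrees, roughly 5 meters of latitude. If you would rather state the tolerance in meters, a distance helper is one option. The sketch below uses the haversine formula with a spherical Earth of radius 6,371 km; `haversineMeters` is a hypothetical name and expects the same `{ lat, lng }` shape the test returns.

```javascript
// Great-circle distance in meters using the haversine formula and a mean
// Earth radius of 6,371 km (a spherical approximation).
function haversineMeters(a, b) {
  const R = 6371000;
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}
```

In a test this could read `expect(haversineMeters(location, { lat: 40.7128, lng: -74.0060 })).toBeLessThan(10);`.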

Python Implementation with pyppeteer

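For Python developers, pyppeteer is an unofficial port that mirrors Puppeteer's API. The example below assumes your installed version exposes `setGeolocation` and `overridePermissions`; the port can lag behind the Node library, so check before relying on it: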

import asyncio
from pyppeteer import launch

async def set_geolocation_python():
    browser = await launch()
    page = await browser.newPage()

    # Set geolocation
    await page.setGeolocation({
        'latitude': 40.7128,
        'longitude': -74.0060,
        'accuracy': 100
    })

    # Grant permissions
    context = browser.defaultBrowserContext()
    await context.overridePermissions('https://example.com', ['geolocation'])

    await page.goto('https://example.com')

    # Get current position
    position = await page.evaluate('''() => {
        return new Promise((resolve) => {
            navigator.geolocation.getCurrentPosition((position) => {
                resolve({
                    'latitude': position.coords.latitude,
                    'longitude': position.coords.longitude
                });
            });
        });
    }''')

    print(f"Position: {position}")
    await browser.close()

# Run the function (asyncio.run creates and closes the event loop for you)
asyncio.run(set_geolocation_python())

Conclusion

Handling geolocation in Puppeteer requires understanding both permission management and coordinate overrides. By properly configuring geolocation settings, you can scrape location-based content, test geographical features, and simulate users from different regions. Remember to grant permissions and set coordinates before navigating to the page, and handle timing issues appropriately for reliable automation.

For more advanced browser automation techniques, you might want to explore how to handle cookies and sessions in Playwright or learn about emulating different devices in Playwright for comprehensive testing scenarios.
