
How to handle browser sessions in Puppeteer?

Browser session management in Puppeteer is essential for maintaining authentication states, user preferences, and stateful interactions across page navigations. This involves managing cookies, localStorage, sessionStorage, and other browser storage mechanisms.

What is a Browser Session?

A browser session consists of:

- Cookies: Small values stored by the browser, typically holding session tokens and preferences issued by the server
- localStorage: Persistent client-side data, scoped to an origin
- sessionStorage: Temporary client-side data (cleared when the tab closes)
- IndexedDB: Structured client-side database storage
- Cache: Stored resources and data
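
Saving cookies and web storage to a file covers only part of this list. One alternative, sketched here under assumptions, is to let Chromium keep the whole profile on disk by passing the `userDataDir` option to `puppeteer.launch()`. The helper name `launchWithProfile` and the `./puppeteer-profile` path are hypothetical, and the Puppeteer module is passed in as a parameter so the helper has no hard dependency of its own.

```javascript
const path = require('path');

// Hypothetical helper: launch a browser whose profile directory is kept on
// disk, so data stored in it can be reused by the next launch.
// `puppeteerModule` is whatever `require('puppeteer')` returns.
function launchWithProfile(puppeteerModule, profileDir = './puppeteer-profile') {
    return puppeteerModule.launch({
        // userDataDir is a Puppeteer launch option; the path itself is an assumption
        userDataDir: path.resolve(profileDir)
    });
}

// Example call (inside an async function):
// const browser = await launchWithProfile(require('puppeteer'));
```

Cookies without an expiry are usually still discarded when the browser closes, so a persistent profile may not be enough for sites that rely on session-only cookies.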

Storing Browser Sessions

1. Saving Cookies

const puppeteer = require('puppeteer');
const fs = require('fs');

async function saveSession() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Navigate and perform login
    await page.goto('https://example.com/login');
    await page.type('#username', 'your-username');
    await page.type('#password', 'your-password');
    // Start waiting before clicking so a fast navigation is not missed
    await Promise.all([
        page.waitForNavigation(),
        page.click('#login-button')
    ]);

    // Save cookies to file
    const cookies = await page.cookies();
    fs.writeFileSync('session-cookies.json', JSON.stringify(cookies, null, 2));

    await browser.close();
}

saveSession();

2. Saving All Storage Data

async function saveCompleteSession() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');

    // Read localStorage and sessionStorage from the page context
    const sessionData = await page.evaluate(() => {
        return {
            localStorage: JSON.stringify(localStorage),
            sessionStorage: JSON.stringify(sessionStorage)
        };
    });

    // Read cookies through Puppeteer: unlike document.cookie, this includes HttpOnly cookies
    const cookies = await page.cookies();

    const completeSession = {
        cookies: cookies,
        storage: sessionData
    };

    fs.writeFileSync('complete-session.json', JSON.stringify(completeSession, null, 2));

    await browser.close();
}

Restoring Browser Sessions

1. Restoring from Cookies

async function restoreSession() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Load saved cookies
    const cookies = JSON.parse(fs.readFileSync('session-cookies.json', 'utf8'));

    // Set cookies before navigation
    await page.setCookie(...cookies);

    // Navigate to protected page
    await page.goto('https://example.com/dashboard');

    // Verify session is restored
    const isLoggedIn = await page.$('.user-profile') !== null;
    console.log('Session restored:', isLoggedIn);

    await browser.close();
}

2. Restoring Complete Session

async function restoreCompleteSession() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Load complete session data
    const sessionData = JSON.parse(fs.readFileSync('complete-session.json', 'utf8'));

    // Set cookies
    await page.setCookie(...sessionData.cookies);

    // Navigate to the page first
    await page.goto('https://example.com');

    // Restore localStorage and sessionStorage
    await page.evaluate((storage) => {
        if (storage.localStorage) {
            const localData = JSON.parse(storage.localStorage);
            for (const [key, value] of Object.entries(localData)) {
                localStorage.setItem(key, value);
            }
        }

        if (storage.sessionStorage) {
            const sessionData = JSON.parse(storage.sessionStorage);
            for (const [key, value] of Object.entries(sessionData)) {
                sessionStorage.setItem(key, value);
            }
        }
    }, sessionData.storage);

    // Refresh to apply restored data
    await page.reload();

    await browser.close();
}
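
Writing storage after `goto` and then reloading means the page's scripts run once without the restored data. A possible alternative, sketched below, registers the data with `page.evaluateOnNewDocument()` so it is written before any page script runs on the next navigation. The helper name `seedStorageBeforeLoad` and the origin check are assumptions for illustration; `storage` is expected to have the shape saved by `saveCompleteSession`.

```javascript
// Hypothetical helper: register saved storage so it is written when a new
// document is created, before the page's own scripts execute.
async function seedStorageBeforeLoad(page, origin, storage) {
    const localData = JSON.parse(storage.localStorage || '{}');
    const sessionData = JSON.parse(storage.sessionStorage || '{}');

    await page.evaluateOnNewDocument((expectedOrigin, local, session) => {
        // Runs in every new document; only touch the origin the data came from
        if (window.location.origin !== expectedOrigin) return;
        for (const [key, value] of Object.entries(local)) {
            localStorage.setItem(key, value);
        }
        for (const [key, value] of Object.entries(session)) {
            sessionStorage.setItem(key, value);
        }
    }, origin, localData, sessionData);
}

// Example call (inside an async function, before page.goto):
// await seedStorageBeforeLoad(page, 'https://example.com', sessionData.storage);
```

The registered script runs on every later navigation too, so it overwrites values the site changes in the meantime. That is acceptable for short scripted runs but worth keeping in mind for long-lived pages.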

Clearing Browser Sessions

1. Clear Specific Cookies

async function clearSpecificCookies() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');

    // Get current cookies
    const cookies = await page.cookies();

    // Clear specific cookies (e.g., session cookies)
    const sessionCookies = cookies.filter(cookie => 
        cookie.name.includes('session') || cookie.name.includes('auth')
    );

    if (sessionCookies.length > 0) {
        await page.deleteCookie(...sessionCookies);
    }

    await browser.close();
}

2. Clear All Session Data

async function clearAllSessionData() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://example.com');

    // Clear all cookies
    const cookies = await page.cookies();
    if (cookies.length > 0) {
        await page.deleteCookie(...cookies);
    }

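    // Clear all storage
    await page.evaluate(async () => {
        // Clear localStorage
        localStorage.clear();

        // Clear sessionStorage
        sessionStorage.clear();

        // Clear IndexedDB; await each deletion so evaluate() does not return early
        if (window.indexedDB && indexedDB.databases) {
            const databases = await indexedDB.databases();
            await Promise.all(databases.map(db => new Promise((resolve, reject) => {
                const request = indexedDB.deleteDatabase(db.name);
                request.onsuccess = resolve;
                request.onerror = () => reject(request.error);
                // Blocked means an open connection delays the deletion; do not hang on it
                request.onblocked = resolve;
            })));
        }
    });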

    await browser.close();
}

Advanced Session Management

1. Session Manager Class

class PuppeteerSessionManager {
    constructor(sessionFilePath = 'session.json') {
        this.sessionFilePath = sessionFilePath;
    }

    async saveSession(page) {
        const cookies = await page.cookies();
        const storage = await page.evaluate(() => ({
            localStorage: JSON.stringify(localStorage),
            sessionStorage: JSON.stringify(sessionStorage)
        }));

        const sessionData = {
            cookies,
            storage,
            timestamp: Date.now()
        };

        fs.writeFileSync(this.sessionFilePath, JSON.stringify(sessionData, null, 2));
    }

    async loadSession(page) {
        if (!fs.existsSync(this.sessionFilePath)) {
            return false;
        }

        const sessionData = JSON.parse(fs.readFileSync(this.sessionFilePath, 'utf8'));

        // Check if session is expired (24 hours)
        if (Date.now() - sessionData.timestamp > 24 * 60 * 60 * 1000) {
            return false;
        }

        // Set cookies
        if (sessionData.cookies.length > 0) {
            await page.setCookie(...sessionData.cookies);
        }

        return true;
    }

    async restoreStorage(page, sessionData) {
        await page.evaluate((storage) => {
            if (storage.localStorage) {
                const localData = JSON.parse(storage.localStorage);
                for (const [key, value] of Object.entries(localData)) {
                    localStorage.setItem(key, value);
                }
            }

            if (storage.sessionStorage) {
                const sessionData = JSON.parse(storage.sessionStorage);
                for (const [key, value] of Object.entries(sessionData)) {
                    sessionStorage.setItem(key, value);
                }
            }
        }, sessionData.storage);
    }
}

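// Usage (wrapped in an async function because CommonJS has no top-level await)
async function main() {
    const sessionManager = new PuppeteerSessionManager();
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Try to load existing session (this restores cookies only)
    const sessionLoaded = await sessionManager.loadSession(page);

    if (!sessionLoaded) {
        // Perform login
        await page.goto('https://example.com/login');
        // ... login process ...

        // Save session after successful login
        await sessionManager.saveSession(page);
    }

    await page.goto('https://example.com/dashboard');

    // Storage is per-origin, so restore it only after the page is on the site
    if (sessionLoaded) {
        const saved = JSON.parse(fs.readFileSync(sessionManager.sessionFilePath, 'utf8'));
        await sessionManager.restoreStorage(page, saved);
    }

    await browser.close();
}

main().catch(console.error);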

2. Domain-Specific Cookie Management

async function manageDomainCookies(page, domain) {
    // Get cookies for specific domain
    const allCookies = await page.cookies();
    const domainCookies = allCookies.filter(cookie => 
        cookie.domain === domain || cookie.domain === `.${domain}`
    );

    console.log(`Found ${domainCookies.length} cookies for ${domain}`);

    // Clear only domain-specific cookies
    if (domainCookies.length > 0) {
        await page.deleteCookie(...domainCookies);
    }

    return domainCookies;
}
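
The strict comparison in `manageDomainCookies` only matches the exact domain string, so it skips cookies set for a parent domain when you pass a subdomain. If the goal is to select every cookie that would apply to a given host, a suffix match is one option. This is a minimal sketch, assuming Chromium's convention that a leading dot marks a cookie as valid for subdomains; the helper names are hypothetical.

```javascript
// Hypothetical helper: does a cookie's domain attribute cover `host`?
// A leading dot (".example.com") means the cookie also applies to subdomains.
function cookieMatchesHost(cookieDomain, host) {
    const domain = cookieDomain.toLowerCase();
    const target = host.toLowerCase();
    if (domain.startsWith('.')) {
        const bareDomain = domain.slice(1);
        // Match the bare domain itself or anything ending in ".example.com"
        return target === bareDomain || target.endsWith(domain);
    }
    // Host-only cookie: exact match required
    return target === domain;
}

// Keep only the cookies that apply to `host`
function filterCookiesForHost(cookies, host) {
    return cookies.filter((cookie) => cookieMatchesHost(cookie.domain, host));
}
```

The result of `filterCookiesForHost(await page.cookies(), 'app.example.com')` can be passed to `page.deleteCookie(...)` in the same way as `domainCookies` above.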

Best Practices

  1. Always save sessions after successful authentication
  2. Implement session expiration checks
  3. Handle cookie domain restrictions properly
  4. Store session data securely (encrypt sensitive data)
  5. Clear sessions when no longer needed
  6. Test session restoration thoroughly
  7. Handle different storage types appropriately
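
To illustrate the point about storing session data securely, here is a minimal sketch that encrypts the session object with AES-256-GCM from Node's built-in `crypto` module before it is written to disk. The helper names `encryptSession` and `decryptSession` are hypothetical, and how the 32-byte key is generated and stored (environment variable, secret manager) is left as an assumption.

```javascript
const crypto = require('crypto');

// Hypothetical helper: encrypt a session object with AES-256-GCM.
// `key` must be a 32-byte Buffer; managing it is outside this sketch.
function encryptSession(sessionData, key) {
    const iv = crypto.randomBytes(12); // fresh nonce for every encryption
    const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
    const ciphertext = Buffer.concat([
        cipher.update(JSON.stringify(sessionData), 'utf8'),
        cipher.final()
    ]);
    return {
        iv: iv.toString('base64'),
        tag: cipher.getAuthTag().toString('base64'),
        data: ciphertext.toString('base64')
    };
}

// Hypothetical helper: reverse of encryptSession.
function decryptSession(payload, key) {
    const decipher = crypto.createDecipheriv(
        'aes-256-gcm',
        key,
        Buffer.from(payload.iv, 'base64')
    );
    decipher.setAuthTag(Buffer.from(payload.tag, 'base64'));
    const plaintext = Buffer.concat([
        decipher.update(Buffer.from(payload.data, 'base64')),
        decipher.final() // throws if the data or tag was modified
    ]);
    return JSON.parse(plaintext.toString('utf8'));
}
```

The object returned by `encryptSession` is plain JSON, so it can be written with `fs.writeFileSync` in place of the unencrypted session data used in the earlier examples.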

Common Issues and Solutions

Issue: Cookies not being set

// Solution: give each cookie a domain (or url), or navigate to the site first
// so Puppeteer can infer the URL from the current page
await page.goto('https://example.com');
await page.setCookie({
    name: 'session_id',
    value: 'abc123',
    domain: 'example.com'
});

Issue: Session not persisting

// Solution: Ensure cookies have proper expiration
await page.setCookie({
    name: 'session_id',
    value: 'abc123',
    domain: 'example.com',
    expires: Math.floor(Date.now() / 1000) + (24 * 60 * 60) // 24 hours
});
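
When a restored session does not stick, it can help to inspect the saved cookies before calling `page.setCookie`. The sketch below sorts them into still valid, already expired, and session-only. It assumes the cookie shape returned by `page.cookies()`, where `expires` is in Unix seconds and `-1` marks a session cookie; the helper name `classifyCookies` is hypothetical.

```javascript
// Hypothetical helper: group saved cookies by whether they are still usable.
// Assumes `expires` is in Unix seconds, with -1 (or no value) for session cookies.
function classifyCookies(cookies, nowMs = Date.now()) {
    const nowSeconds = nowMs / 1000;
    const result = { valid: [], expired: [], sessionOnly: [] };

    for (const cookie of cookies) {
        if (cookie.expires === undefined || cookie.expires === -1) {
            result.sessionOnly.push(cookie); // no expiry: lives until the browser closes
        } else if (cookie.expires <= nowSeconds) {
            result.expired.push(cookie);     // the browser would drop this one
        } else {
            result.valid.push(cookie);
        }
    }
    return result;
}
```

If the cookie that carries authentication lands in `expired`, restoring the file will not help and a fresh login is needed.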

Browser session management in Puppeteer enables you to maintain stateful interactions across multiple page loads and browser instances, essential for complex automation and scraping scenarios.
