How can I use Puppeteer with Docker?

Running Puppeteer inside Docker is a common way to deploy web scraping applications to production: the container pins Chrome, fonts, and system libraries so runs are reproducible across environments. This guide covers current best practices for containerizing Puppeteer applications.

Quick Start Dockerfile

Here's a production-ready Dockerfile based on a slim Node.js image, following current security best practices:

FROM node:18-slim

# Install Chrome dependencies
RUN apt-get update \
    && apt-get install -y wget gnupg ca-certificates \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg \
    && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list \
    && apt-get update \
    && apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
    && mkdir -p /home/pptruser/Downloads \
    && chown -R pptruser:pptruser /home/pptruser

# Set working directory
WORKDIR /app

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm ci --omit=dev

# Copy application code
COPY . .

# Change ownership to non-root user
RUN chown -R pptruser:pptruser /app

# Switch to non-root user
USER pptruser

CMD ["node", "index.js"]
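One caveat with this image: npm ci will still download Puppeteer's bundled Chromium even though Chrome is installed via apt. Setting two environment variables before the install step avoids the duplicate download and points Puppeteer at the system binary (note that newer Puppeteer releases spell the skip variable PUPPETEER_SKIP_DOWNLOAD):

```dockerfile
# Use the system Chrome instead of downloading a bundled Chromium
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true \
    PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable
```

Place these lines above the RUN npm ci instruction so the variables are set when Puppeteer's install script runs.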

Alternative: Using Official Puppeteer Image

For simpler setups, use the official Puppeteer Docker image:

FROM ghcr.io/puppeteer/puppeteer:21.5.2

# Copy package files (the image runs as pptruser, so copy with matching ownership)
COPY --chown=pptruser:pptruser package*.json ./

# Install dependencies (Puppeteer already installed)
RUN npm ci --omit=dev

# Copy application code
COPY --chown=pptruser:pptruser . .

CMD ["node", "index.js"]
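Both Dockerfiles copy the entire build context with COPY . ., so a .dockerignore file next to the Dockerfile keeps node_modules, local output, and VCS metadata out of the image (a minimal example; adjust to your project):

```
node_modules
output
npm-debug.log
.git
.dockerignore
Dockerfile
```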

Secure Container Configuration

1. Build and Run Commands

# Build the image
docker build -t puppeteer-app .

# Run with Chrome's own sandbox enabled: SYS_ADMIN grants the privileges
# the sandbox needs, at the cost of a broader container attack surface
docker run --rm --init --cap-add=SYS_ADMIN \
  puppeteer-app

# Run without extra capabilities (requires launching Chrome with
# --no-sandbox, as in the configuration below)
docker run --rm --init puppeteer-app

2. Docker Compose Setup

version: '3.8'
services:
  puppeteer:
    build: .
    init: true
    cap_add:
      - SYS_ADMIN
    volumes:
      - ./output:/app/output
    environment:
      - NODE_ENV=production
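Chrome makes heavy use of /dev/shm, which defaults to 64 MB in Docker; in Compose this is raised with shm_size, the equivalent of docker run --shm-size (the fragment below extends the service definition above):

```yaml
services:
  puppeteer:
    # ...same service definition as above...
    shm_size: '1gb'
```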

Puppeteer Configuration for Docker

Basic Configuration

const puppeteer = require('puppeteer');

const launchOptions = {
  headless: 'new',
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-accelerated-2d-canvas',
    '--no-first-run',
    '--no-zygote',
    '--single-process',
    '--disable-gpu'
  ]
};

(async () => {
  const browser = await puppeteer.launch(launchOptions);
  const page = await browser.newPage();

  await page.goto('https://example.com');
  await page.screenshot({
    path: '/app/output/screenshot.png',
    fullPage: true
  });

  await browser.close();
})();

Production-Ready Example

const puppeteer = require('puppeteer');

class DockerPuppeteerService {
  constructor() {
    this.launchOptions = {
      headless: 'new',
      args: [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-dev-shm-usage',
        '--disable-accelerated-2d-canvas',
        '--no-first-run',
        '--no-zygote',
        '--disable-gpu',
        '--disable-features=VizDisplayCompositor'
      ],
      timeout: 30000
    };
  }

  async scrapeWithRetry(url, maxRetries = 3) {
    let browser;

    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      try {
        browser = await puppeteer.launch(this.launchOptions);
        const page = await browser.newPage();

        await page.setViewport({ width: 1920, height: 1080 });
        await page.goto(url, { waitUntil: 'networkidle2', timeout: 15000 });

        const data = await page.evaluate(() => {
          return {
            title: document.title,
            content: document.body.innerText,
            url: window.location.href
          };
        });

        return data;

      } catch (error) {
        console.error(`Attempt ${attempt} failed:`, error.message);
        if (attempt === maxRetries) throw error;
        await new Promise(resolve => setTimeout(resolve, 1000 * attempt));

      } finally {
        if (browser) {
          await browser.close();
        }
      }
    }
  }
}

// Usage
(async () => {
  const scraper = new DockerPuppeteerService();
  try {
    const result = await scraper.scrapeWithRetry('https://example.com');
    console.log('Scraped data:', result);
  } catch (error) {
    console.error('Scraping failed:', error);
    process.exit(1);
  }
})();

Troubleshooting Common Issues

Memory Issues

If you encounter memory problems, increase shared memory:

docker run --rm --init --shm-size=1gb \
  --cap-add=SYS_ADMIN puppeteer-app
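Beyond shared memory, it helps to cap total memory and CPU so a leaking Chrome process cannot starve the host (standard docker run flags):

docker run --rm --init --shm-size=1gb \
  --memory=2g --cpus=1.5 \
  puppeteer-app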

Permission Errors

Ensure proper user permissions in your Dockerfile:

# Create and switch to non-root user
RUN addgroup --system --gid 1001 nodejs \
    && adduser --system --uid 1001 --gid 1001 nodejs
USER nodejs

Font Rendering Issues

Install additional fonts for international content:

RUN apt-get update && apt-get install -y \
  fonts-liberation \
  fonts-noto-color-emoji \
  fonts-noto-cjk-extra \
  && rm -rf /var/lib/apt/lists/*

Best Practices

  1. Use multi-stage builds to reduce image size
  2. Pin specific versions of Node.js and Puppeteer
  3. Run as non-root user for security
  4. Set resource limits in production
  5. Use health checks to monitor container status
  6. Handle graceful shutdowns with proper signal handling
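For point 6: docker stop sends SIGTERM and force-kills the container after a grace period (10 seconds by default), so the process should close the browser promptly on that signal. A minimal sketch — registerShutdown and its injectable proc parameter are illustrative names, not a Puppeteer API:

```javascript
// Register SIGTERM/SIGINT handlers that run an async cleanup
// (e.g. browser.close()) exactly once, then exit.
function registerShutdown(cleanup, proc = process) {
  let closing = false;
  const handler = async () => {
    if (closing) return; // ignore repeated signals
    closing = true;
    try {
      await cleanup(); // e.g. await browser.close()
    } finally {
      proc.exit(0);
    }
  };
  proc.on('SIGTERM', handler);
  proc.on('SIGINT', handler);
  return handler; // returned for testing/inspection
}
```

In the service class above this would be wired up as registerShutdown(() => browser.close()) after launching the browser.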

This setup provides a robust foundation for running Puppeteer applications in Docker containers with proper security and performance considerations.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
