Running Puppeteer in Docker is a common way to deploy web scraping applications to production, because the image bundles Chrome, its system libraries, and fonts alongside your code. This guide covers current practices for containerizing Puppeteer applications.
Quick Start Dockerfile
Here's a Dockerfile that installs Chrome from Google's repository and runs the application as a non-root user. It is based on node:18-slim; substitute the Node.js LTS tag you have pinned:
FROM node:18-slim
# Install Chrome and fonts; the signing key goes into a dedicated keyring because apt-key is deprecated
RUN apt-get update \
    && apt-get install -y --no-install-recommends wget gnupg ca-certificates \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg \
    && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google.list \
    && apt-get update \
    && apt-get install -y --no-install-recommends google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 \
    && rm -rf /var/lib/apt/lists/*
# Create non-root user
RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
&& mkdir -p /home/pptruser/Downloads \
&& chown -R pptruser:pptruser /home/pptruser
# Set working directory
WORKDIR /app
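# Copy package files
COPY package*.json ./
# Use the system Chrome installed above rather than downloading a second copy during npm ci
ENV PUPPETEER_SKIP_DOWNLOAD=true \
    PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable
# Install production dependencies only (--only=production is deprecated)
RUN npm ci --omit=dev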
# Copy application code
COPY . .
# Change ownership to non-root user
RUN chown -R pptruser:pptruser /app
# Switch to non-root user
USER pptruser
CMD ["node", "index.js"]
Alternative: Using Official Puppeteer Image
For simpler setups, use the official Puppeteer Docker image:
FROM ghcr.io/puppeteer/puppeteer:21.5.2
# Copy package files
COPY package*.json ./
# Install dependencies; keep the puppeteer version in package.json in step with the image tag so the Chrome bundled in the image is reused
RUN npm ci --omit=dev
# Copy application code
COPY . .
CMD ["node", "index.js"]
Secure Container Configuration
1. Build and Run Commands
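# Build the image
docker build -t puppeteer-app .
# Run with Chrome's sandbox available (recommended). SYS_ADMIN lets Chrome create its
# sandbox namespaces; this only helps if the app does not pass --no-sandbox.
docker run --rm --init --cap-add=SYS_ADMIN \
  --security-opt seccomp=unconfined \
  puppeteer-app
# Run with no extra privileges (simpler, but the app must launch Chrome with
# --no-sandbox, which removes Chrome's process isolation)
docker run --rm --init puppeteer-app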
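The two run modes above call for different Chrome flags, and one way to serve both from a single image is to decide at startup. The following is a minimal sketch, assuming a hypothetical environment variable named CHROME_NO_SANDBOX (it is not defined by Puppeteer or Docker) that you would set with docker run -e only when the container runs without SYS_ADMIN:

```javascript
// Minimal sketch: choose Chrome flags from the container environment.
// CHROME_NO_SANDBOX is a hypothetical variable name chosen for this example.
function buildLaunchOptions(env = process.env) {
  const args = ['--disable-dev-shm-usage', '--no-first-run'];
  if (env.CHROME_NO_SANDBOX === '1') {
    // Only drop the sandbox when the operator explicitly asked for it
    args.push('--no-sandbox', '--disable-setuid-sandbox');
  }
  return { headless: 'new', args };
}

console.log(buildLaunchOptions().args.join(' '));
```

The returned object can be passed to puppeteer.launch(). Keeping the sandbox on by default means a missing variable fails loudly at launch rather than silently running Chrome unsandboxed.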
2. Docker Compose Setup
version: '3.8'
services:
  puppeteer:
    build: .
    init: true
    cap_add:
      - SYS_ADMIN
    security_opt:
      - seccomp:unconfined
    volumes:
      - ./output:/app/output
    environment:
      - NODE_ENV=production
Puppeteer Configuration for Docker
Basic Configuration
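const puppeteer = require('puppeteer');

const launchOptions = {
  headless: 'new', // new headless mode in Puppeteer 21; from v22 on, use `true`
  args: [
    // Needed only when the container runs without SYS_ADMIN (see the run commands above)
    '--no-sandbox',
    '--disable-setuid-sandbox',
    // Write shared memory files to /tmp instead of Docker's small default /dev/shm
    '--disable-dev-shm-usage',
    '--disable-accelerated-2d-canvas',
    '--no-first-run',
    '--no-zygote',
    '--disable-gpu'
  ]
};

(async () => {
  const browser = await puppeteer.launch(launchOptions);
  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');
    await page.screenshot({
      path: '/app/output/screenshot.png',
      fullPage: true
    });
  } finally {
    // Always close the browser so Chrome processes do not linger in the container
    await browser.close();
  }
})();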
Production-Ready Example
const puppeteer = require('puppeteer');

class DockerPuppeteerService {
  constructor() {
    this.launchOptions = {
      headless: 'new',
      args: [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-dev-shm-usage',
        '--disable-accelerated-2d-canvas',
        '--no-first-run',
        '--no-zygote',
        '--disable-gpu',
        '--disable-features=VizDisplayCompositor'
      ],
      timeout: 30000 // max time (ms) to wait for Chrome to start
    };
  }

  async scrapeWithRetry(url, maxRetries = 3) {
    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      // Fresh browser per attempt, so a stale handle is never reused
      let browser;
      try {
        browser = await puppeteer.launch(this.launchOptions);
        const page = await browser.newPage();
        await page.setViewport({ width: 1920, height: 1080 });
        await page.goto(url, { waitUntil: 'networkidle2', timeout: 15000 });
        // Runs in the page context; only serializable values come back
        return await page.evaluate(() => ({
          title: document.title,
          content: document.body.innerText,
          url: window.location.href
        }));
      } catch (error) {
        console.error(`Attempt ${attempt} failed:`, error.message);
        if (attempt === maxRetries) throw error;
        // Linear backoff: 1 s, 2 s, ...
        await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
      } finally {
        if (browser) {
          // Ignore close errors so they do not mask the original failure
          await browser.close().catch(() => {});
        }
      }
    }
  }
}
// Usage
(async () => {
  const scraper = new DockerPuppeteerService();
  try {
    const result = await scraper.scrapeWithRetry('https://example.com');
    console.log('Scraped data:', result);
  } catch (error) {
    console.error('Scraping failed:', error);
    process.exit(1);
  }
})();
Troubleshooting Common Issues
Memory Issues
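Docker gives a container only 64 MB of /dev/shm by default, which Chrome can exhaust on heavy pages and then crash. Either raise the limit as shown here, or keep the --disable-dev-shm-usage launch flag, which makes Chrome use /tmp instead:
docker run --rm --init --shm-size=1gb \
  --cap-add=SYS_ADMIN puppeteer-app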
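If the same image runs both with and without --shm-size, the application can check the size of /dev/shm at startup and add the flag only when needed. This is a rough sketch rather than an established recipe: the helper names are made up for illustration, the 64 MB threshold simply mirrors Docker's default, and fs.statfsSync is only present in newer Node.js releases (18.15 and later), which is why the code checks for it first.

```javascript
const fs = require('fs');

// Docker's default /dev/shm size; anything at or below this is treated as "small"
const DOCKER_DEFAULT_SHM_BYTES = 64 * 1024 * 1024;

// Returns the total size in bytes of the filesystem mounted at `dir`,
// or null if it cannot be determined.
function filesystemSizeBytes(dir) {
  if (typeof fs.statfsSync !== 'function') return null; // older Node.js
  try {
    const stats = fs.statfsSync(dir);
    return stats.bsize * stats.blocks;
  } catch {
    return null; // path missing or not readable
  }
}

// Pure helper: decide whether Chrome should avoid /dev/shm.
function shmArgs(shmBytes) {
  const tooSmall = shmBytes === null || shmBytes <= DOCKER_DEFAULT_SHM_BYTES;
  return tooSmall ? ['--disable-dev-shm-usage'] : [];
}

const extraArgs = shmArgs(filesystemSizeBytes('/dev/shm'));
console.log('Extra Chrome args:', extraArgs);
```

The result can be spread into the args array passed to puppeteer.launch(). When the size is unknown, the sketch falls back to the flag, since using /tmp is the safer assumption.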
Permission Errors
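Permission errors usually mean the non-root user cannot write to the application directory or to a mounted volume such as ./output. If your image does not already define a user, create one and switch to it: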
# Create and switch to non-root user
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 --gid 1001 nodejs
USER nodejs
Font Rendering Issues
Install additional fonts for international content:
RUN apt-get update && apt-get install -y \
fonts-liberation \
fonts-noto-color-emoji \
fonts-noto-cjk-extra \
&& rm -rf /var/lib/apt/lists/*
Best Practices
- Use multi-stage builds to reduce image size
- Pin specific versions of Node.js and Puppeteer
- Run as non-root user for security
- Set resource limits in production
- Use health checks to monitor container status
- Handle graceful shutdowns with proper signal handling
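To illustrate the last point, here is a minimal sketch of signal handling for a Puppeteer process. The function name, the getBrowser callback, and the injectable proc parameter are all assumptions made for this example; they are not part of Puppeteer's API.

```javascript
// Minimal sketch: close the browser when Docker stops the container.
// `getBrowser` is a hypothetical callback returning the current Puppeteer
// browser (or null). `proc` defaults to the real process and exists so the
// handler can be exercised with a stand-in object.
function installShutdownHandlers(getBrowser, proc = process) {
  let shuttingDown = false;

  const shutdown = async (signal) => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`Received ${signal}, closing browser`);
    let exitCode = 0;
    try {
      const browser = getBrowser();
      if (browser) await browser.close();
    } catch (error) {
      console.error('Error while closing browser:', error.message);
      exitCode = 1;
    }
    proc.exit(exitCode);
  };

  // docker stop sends SIGTERM; Ctrl+C in an attached terminal sends SIGINT
  for (const signal of ['SIGTERM', 'SIGINT']) {
    proc.on(signal, () => shutdown(signal));
  }
  return shutdown;
}
```

In an application you would call installShutdownHandlers(() => browser) once after startup. docker stop waits 10 seconds by default before sending SIGKILL, so the close needs to finish within that window, and the --init flag used in the run commands ensures signals are forwarded to the Node.js process.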
This setup provides a robust foundation for running Puppeteer applications in Docker containers with proper security and performance considerations.