How can I use Puppeteer in a CI/CD pipeline?

Integrating Puppeteer into CI/CD pipelines requires proper environment setup, headless browser configuration, and handling of common challenges like system dependencies and resource constraints.

Key Requirements for CI/CD Integration

1. Headless Browser Configuration

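CI runners usually have no display server, so Puppeteer should run in headless mode there. The launch flags below are a common starting point. Note that --no-sandbox and --disable-setuid-sandbox turn off Chromium's sandbox, so use them only for trusted content inside an isolated container, and that --disable-dev-shm-usage works around the small /dev/shm found in many Docker setups: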

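const puppeteer = require('puppeteer');

// CommonJS modules cannot use await at the top level, so wrap the calls
(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: [
      '--no-sandbox',              // needed when Chromium runs as root in a container
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',   // write shared memory files to /tmp instead of /dev/shm
      '--disable-accelerated-2d-canvas',
      '--no-first-run',
      '--no-zygote',
      '--single-process',          // drop this flag if the browser becomes unstable
      '--disable-gpu'
    ]
  });

  // ... run your automation here ...

  await browser.close();
})();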

2. System Dependencies

Install required system packages for Chromium to run properly:

# For Ubuntu/Debian-based systems
apt-get update && apt-get install -y \
  wget \
  ca-certificates \
  fonts-liberation \
  libasound2 \
  libatk-bridge2.0-0 \
  libdrm2 \
  libxcomposite1 \
  libxdamage1 \
  libxrandr2 \
  libgbm1 \
  libxss1 \
  libnss3
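
The exact package list varies between distributions and Chromium versions, so a launch can still fail with a missing shared library. One way to find out which libraries are missing is to run ldd against the browser binary and look for "not found" entries. The following is a minimal sketch of that check, assuming a Linux runner with ldd on the PATH; findMissingLibraries and checkBinary are hypothetical helper names, not part of Puppeteer.

```javascript
const { execFileSync } = require('node:child_process');

// Hypothetical helper: extract the library names that ldd reports as unresolved.
// ldd prints such entries as "<name> => not found".
function findMissingLibraries(lddOutput) {
  return lddOutput
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.endsWith('=> not found'))
    .map(line => line.split(' ')[0]);
}

// Hypothetical helper: run ldd on a binary and return its unresolved libraries.
// Assumes Linux with ldd available; throws if ldd cannot be executed.
function checkBinary(binaryPath) {
  const output = execFileSync('ldd', [binaryPath], { encoding: 'utf8' });
  return findMissingLibraries(output);
}
```

In a CI step this could be called as checkBinary(puppeteer.executablePath()), failing the build early with the list of missing libraries instead of an opaque launch error.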

Platform-Specific Configurations

GitHub Actions

name: Puppeteer Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '18'
      - run: npm ci
      - run: npm test

GitLab CI

image: node:18

stages:
  - test

test:
  stage: test
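  before_script:
    # The node:18 image lacks the shared libraries Chromium needs (see System Dependencies)
    - apt-get update -qq && apt-get install -y -qq fonts-liberation libasound2 libatk-bridge2.0-0 libdrm2 libxcomposite1 libxdamage1 libxrandr2 libgbm1 libxss1 libnss3
    - npm ci --cache .npm --prefer-offline
  script:
    - npm test
  cache:
    # npm ci deletes node_modules before installing, so cache npm's download cache instead
    paths:
      - .npm/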

CircleCI

version: 2.1
executors:
  node-executor:
    docker:
      - image: cimg/node:18.17-browsers

jobs:
  test:
    executor: node-executor
    steps:
      - checkout
      - restore_cache:
          keys:
            - v1-npm-{{ checksum "package-lock.json" }}
      - run: npm ci
      - save_cache:
          # npm ci deletes node_modules before installing, so cache npm's download cache instead
          paths:
            - ~/.npm
          key: v1-npm-{{ checksum "package-lock.json" }}
      - run: npm test

Docker Integration

Dockerfile for Puppeteer

FROM node:18-slim

# Install system dependencies
RUN apt-get update \
    && apt-get install -y wget gnupg \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/googlechrome-linux-keyring.gpg \
    && sh -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/googlechrome-linux-keyring.gpg] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf \
      --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY package*.json ./
# Install devDependencies too; the test runner invoked by "npm test" usually lives there
RUN npm ci

COPY . .
CMD ["npm", "test"]

Docker Compose for Testing

version: '3.8'
services:
  puppeteer-tests:
    build: .
    environment:
      - NODE_ENV=test
    volumes:
      - ./test-results:/app/test-results

Comprehensive Test Examples

End-to-End Test with Error Handling

const puppeteer = require('puppeteer');

describe('E2E Tests', () => {
  let browser;
  let page;

  beforeAll(async () => {
    browser = await puppeteer.launch({
      headless: process.env.CI === 'true',
      args: ['--no-sandbox', '--disable-setuid-sandbox']
    });
  });

  afterAll(async () => {
    await browser.close();
  });

  beforeEach(async () => {
    page = await browser.newPage();
    await page.setViewport({ width: 1280, height: 720 });
  });

  afterEach(async () => {
    await page.close();
  });

  test('should load homepage and check title', async () => {
    await page.goto('https://example.com', { 
      waitUntil: 'networkidle2',
      timeout: 30000 
    });

    const title = await page.title();
    expect(title).toMatch(/Example Domain/);
  });

  test('should handle form submission', async () => {
    await page.goto('https://example.com/contact');

    await page.type('#name', 'Test User');
    await page.type('#email', 'test@example.com');
    await page.type('#message', 'Test message');

    await Promise.all([
      page.waitForNavigation(),
      page.click('#submit')
    ]);

    const successMessage = await page.$eval('.success', el => el.textContent);
    expect(successMessage).toContain('Message sent successfully');
  });
});

Performance Testing

const puppeteer = require('puppeteer');

describe('Performance Tests', () => {
  test('should meet performance thresholds', async () => {
    const browser = await puppeteer.launch({ headless: true });

    try {
      const page = await browser.newPage();
      // Wait for the network to settle so loadEventEnd has been recorded
      await page.goto('https://example.com', { waitUntil: 'networkidle2' });

      const performanceMetrics = await page.evaluate(() => {
        const navigation = performance.getEntriesByType('navigation')[0];
        // Measure from the start of navigation, not just the event handler duration
        return {
          domContentLoaded: navigation.domContentLoadedEventEnd - navigation.startTime,
          loadComplete: navigation.loadEventEnd - navigation.startTime
        };
      });

      expect(performanceMetrics.domContentLoaded).toBeLessThan(2000);
      expect(performanceMetrics.loadComplete).toBeLessThan(5000);
    } finally {
      // Close the browser even when an assertion fails
      await browser.close();
    }
  });
});
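
When several thresholds are checked, a failing build is easier to read if every exceeded budget is reported at once rather than stopping at the first failed assertion. The sketch below shows one way to do that; findBudgetViolations is a hypothetical helper, and the metric names simply mirror the ones used in the test above.

```javascript
// Hypothetical helper: compare measured timings (in ms) against budgets (in ms)
// and return one message per budget that is exceeded or has no measurement.
function findBudgetViolations(metrics, budgets) {
  const violations = [];
  for (const [name, limit] of Object.entries(budgets)) {
    const value = metrics[name];
    if (typeof value !== 'number' || Number.isNaN(value)) {
      violations.push(`${name}: no measurement recorded`);
    } else if (value > limit) {
      violations.push(`${name}: ${value}ms exceeds budget of ${limit}ms`);
    }
  }
  return violations;
}
```

A test could then assert that findBudgetViolations(performanceMetrics, { domContentLoaded: 2000, loadComplete: 5000 }) returns an empty array, so the failure output lists every budget that was missed.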

Common CI/CD Challenges and Solutions

Memory Management

// Limit concurrent pages to prevent memory issues
const MAX_CONCURRENT_PAGES = 3;
const semaphore = new Array(MAX_CONCURRENT_PAGES).fill(null);

async function runTestWithSemaphore(testFn) {
  await new Promise(resolve => {
    const tryAcquire = () => {
      const index = semaphore.findIndex(slot => slot === null);
      if (index !== -1) {
        semaphore[index] = true;
        resolve();
      } else {
        setTimeout(tryAcquire, 100);
      }
    };
    tryAcquire();
  });

  try {
    await testFn();
  } finally {
    const index = semaphore.findIndex(slot => slot === true);
    semaphore[index] = null;
  }
}
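
The polling approach above wakes up every 100 ms to look for a free slot. An alternative, sketched here under the same idea of capping concurrent pages, is a small promise queue that hands a slot directly to the next waiting task. createLimiter is a hypothetical name, not a Puppeteer API.

```javascript
// Hypothetical helper: returns a function that runs async tasks with at most
// `maxConcurrent` in flight. Extra tasks wait in a FIFO queue.
function createLimiter(maxConcurrent) {
  let active = 0;
  const waiting = [];

  const release = () => {
    const next = waiting.shift();
    if (next) {
      next();        // hand the slot straight to the next waiter
    } else {
      active -= 1;   // nobody is waiting, so free the slot
    }
  };

  return async function run(task) {
    if (active >= maxConcurrent) {
      // Park until a finishing task hands over its slot
      await new Promise(resolve => waiting.push(resolve));
    } else {
      active += 1;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}
```

With const limit = createLimiter(3), each test body can be wrapped as limit(() => testFn()). The task's return value and any error it throws are passed through to the caller.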

Screenshot and Artifact Collection

// Capture screenshots on test failure.
// this.currentTest is Mocha's API and needs a regular function, not an arrow function.
const fs = require('fs/promises');

afterEach(async function () {
  if (this.currentTest.state === 'failed') {
    const testName = this.currentTest.title.replace(/\s+/g, '-');
    await fs.mkdir('screenshots', { recursive: true });
    await page.screenshot({ path: `screenshots/${testName}.png`, fullPage: true });
  }
});
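
Test titles often contain characters such as slashes, colons, or quotes that are awkward or invalid in file names, and replacing only whitespace leaves them in place. A minimal sketch of a stricter naming helper follows; screenshotPath is a hypothetical name and the 100-character cap is an arbitrary choice.

```javascript
const path = require('node:path');

// Hypothetical helper: turn a test title into a filesystem-safe PNG path.
function screenshotPath(testTitle, dir = 'screenshots') {
  const safeName = testTitle
    .trim()
    .replace(/[^a-zA-Z0-9._-]+/g, '-') // collapse each run of unsafe characters into one dash
    .replace(/^-+|-+$/g, '')           // drop leading and trailing dashes
    .slice(0, 100) || 'unnamed-test';  // fall back when nothing usable remains
  return path.join(dir, `${safeName}.png`);
}
```

The result can be passed as the path option of page.screenshot, and the screenshots directory can then be uploaded as a build artifact by the CI platform.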

Environment-Specific Configuration

// config/puppeteer.config.js
module.exports = {
  development: {
    headless: false,
    slowMo: 250,
    devtools: true
  },
  ci: {
    headless: true,
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage'
    ]
  },
  production: {
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  }
};

// Usage (inside an async function; CommonJS modules cannot use top-level await)
const env = process.env.NODE_ENV || 'development';
const config = require('./config/puppeteer.config')[env];
const browser = await puppeteer.launch(config);
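
Selecting a profile by NODE_ENV alone never picks the ci profile unless NODE_ENV is set to "ci", and an unrecognized value such as "test" yields no profile at all. The sketch below is one possible selection rule, assuming the runner sets CI=true (as GitHub Actions, GitLab CI, and CircleCI do). resolveLaunchConfig is a hypothetical helper, and the profiles are inlined only to keep the example self-contained.

```javascript
// Inline copy of the profiles so the sketch runs on its own
const profiles = {
  development: { headless: false, slowMo: 250, devtools: true },
  ci: {
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage']
  },
  production: { headless: true, args: ['--no-sandbox', '--disable-setuid-sandbox'] }
};

// Hypothetical helper: pick launch options from environment variables.
// Assumption: unknown NODE_ENV values fall back to the headless ci profile,
// because a headed browser cannot start on a machine without a display.
function resolveLaunchConfig(env = process.env) {
  if (env.CI === 'true') {
    return profiles.ci;
  }
  const name = env.NODE_ENV || 'development';
  return profiles[name] || profiles.ci;
}
```

The returned object can be passed directly to puppeteer.launch.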

Best Practices

  1. Use headless mode in CI environments
  2. Set appropriate timeouts for page loads and element waits
  3. Implement retry logic for flaky tests
  4. Clean up resources properly (close browsers and pages)
  5. Use screenshot capture for debugging failed tests
  6. Limit concurrent browser instances to manage memory usage
  7. Cache Chromium downloads to speed up CI builds
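
Retry logic (item 3) can be as simple as a wrapper that repeats an async action a few times before giving up. The following is a minimal sketch with linear backoff; withRetries is a hypothetical helper, and the default attempt count and delay are arbitrary.

```javascript
// Hypothetical helper: retry an async action up to `attempts` times,
// waiting delayMs * attemptNumber between tries (linear backoff).
// Rethrows the last error if every attempt fails.
async function withRetries(action, { attempts = 3, delayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await action(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await new Promise(resolve => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  throw lastError;
}
```

For example, a flaky navigation could be wrapped as withRetries(() => page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 })). Retries hide real regressions if applied too broadly, so it is usually better to wrap individual network-dependent steps than whole tests.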

Together, these settings address the usual failure points for Puppeteer in CI/CD environments: missing system libraries, sandbox restrictions in containers, limited memory, and flaky tests.
