Can I access the HTML source of a page using Playwright?

Yes, you can access the HTML source of a webpage using Playwright. Playwright is a browser automation library that drives Chromium, Firefox, and WebKit through a single API, with official bindings for Node.js, Python, Java, and .NET. It can run browsers in headless (no GUI) or headed mode and supports web scraping, simulating user interaction, and much more.

Here is how you can do it in both JavaScript and Python:

JavaScript

In JavaScript, you can use the page.content() method to get the full HTML of a page, including the doctype. Here's an example:

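const playwright = require('playwright');

(async () => {
  // Launch a headless Chromium instance and open a page in a fresh context
  const browser = await playwright.chromium.launch();
  const context = await browser.newContext();
  const page = await context.newPage();

  // Navigate, then serialize the current DOM to an HTML string
  await page.goto('https://example.com');
  const htmlContent = await page.content();

  console.log(htmlContent);

  // Shut down the browser and release its resources
  await browser.close();
})();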

In this script, we launch a Chromium browser, create a new browser context, open a new page, navigate to 'https://example.com', and then get the HTML content of the page using page.content().

Python

Similarly, in Python, you can use the page.content() method to get the full HTML of a page. Here's an example:

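from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Launch a headless Chromium instance and open a page in a fresh context
    browser = p.chromium.launch()
    context = browser.new_context()
    page = context.new_page()

    # Navigate, then serialize the current DOM to an HTML string
    page.goto('https://example.com')
    html_content = page.content()

    print(html_content)

    # Shut down the browser and release its resources
    browser.close()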

In this script, we're doing the same thing as in the JavaScript version: launch a Chromium browser, create a new browser context, open a new page, navigate to 'https://example.com', and then get the HTML content of the page using page.content().
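If you need the HTML exactly as the server sent it, rather than the browser's serialized DOM, one option is to read the body of the response returned by page.goto(). The sketch below is only an illustration: the helper name fetch_raw_and_rendered and the main() wrapper are made up for this example, and the Playwright import is kept inside main() so the helper can be used with any page-like object.

```python
from typing import Optional, Tuple


def fetch_raw_and_rendered(page, url: str) -> Tuple[Optional[str], str]:
    """Return (raw response body, serialized DOM) for `url`.

    `page` is expected to behave like a Playwright sync Page.
    This helper is hypothetical, not part of Playwright itself.
    """
    response = page.goto(url)
    # goto() may return None (for example for about:blank), so guard it
    raw_html = response.text() if response is not None else None
    # content() serializes the DOM as it exists right now
    rendered_html = page.content()
    return raw_html, rendered_html


def main() -> None:
    # Imported here so the helper above has no hard dependency on Playwright
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            raw_html, rendered_html = fetch_raw_and_rendered(
                page, "https://example.com"
            )
            print(len(raw_html or ""), len(rendered_html))
        finally:
            # Close the browser even if navigation fails
            browser.close()
```

Calling main() runs the helper against a real browser. On a static page the two strings are usually close; on a script-heavy page they can differ substantially.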

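Note that page.content() returns the HTML of the DOM as it exists at the moment of the call, not the original response sent by the server. By default, page.goto() waits for the page's load event, so content that JavaScript has added by then is included in the returned HTML. Content that loads later is only included if you wait for it before calling page.content().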
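For pages that keep loading content after navigation finishes, a common approach is to wait for an element you expect to appear before reading the HTML. The following is a minimal sketch under stated assumptions: the helper name fetch_html_after, the 10-second default timeout, and the "h1" selector in main() are all placeholders chosen for illustration, so substitute a selector that matches the content you actually need.

```python
def fetch_html_after(page, url: str, selector: str,
                     timeout_ms: float = 10_000) -> str:
    """Navigate to `url`, wait for `selector`, then return the page HTML.

    `page` is expected to behave like a Playwright sync Page.
    This helper is hypothetical, not part of Playwright itself.
    """
    page.goto(url)
    # Blocks until an element matching the selector is present and visible;
    # Playwright raises a timeout error if it does not show up in time
    page.wait_for_selector(selector, timeout=timeout_ms)
    return page.content()


def main() -> None:
    # Imported here so the helper above has no hard dependency on Playwright
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            # "h1" is a placeholder selector for this example
            html = fetch_html_after(page, "https://example.com", "h1")
            print(html)
        finally:
            # Close the browser even if the wait times out
            browser.close()
```

Calling main() runs the helper against a real browser. Waiting on a specific element tends to be more predictable than a fixed sleep, because the script continues as soon as the content is there.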
