How can I implement the Page Object Model pattern with Selenium WebDriver?

The Page Object Model (POM) is a design pattern that creates an abstraction layer between your test code and the web pages you're testing. This pattern helps you create maintainable, reusable, and readable test automation code by encapsulating page elements and their interactions within dedicated page classes.

What is the Page Object Model Pattern?

The Page Object Model pattern involves creating a separate class for each web page in your application. Each page class contains:

  • Page Elements: Web elements as class attributes or properties
  • Page Methods: Actions that can be performed on the page
  • Page Verification: Methods to verify page state or content

This approach separates the test logic from the page-specific code, making your tests more maintainable and reducing code duplication.
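
As a minimal sketch of those three parts, the example below uses hypothetical stand-ins (FakeDriver, FakeElement, and a SearchPage with made-up locators) so it runs without a browser. The locator strategy "id" is the same string that By.ID holds in Selenium's Python bindings; everything else is an assumption for illustration.

```python
class FakeElement:
    """Stand-in for a WebElement (hypothetical, for illustration only)."""

    def __init__(self, text=""):
        self.text = text
        self.clicked = False

    def click(self):
        self.clicked = True


class FakeDriver:
    """Stand-in for a WebDriver: maps locator tuples to fake elements."""

    def __init__(self, elements):
        self._elements = elements

    def find_element(self, by, value):
        return self._elements[(by, value)]


class SearchPage:
    # Page elements: locators stored as class attributes
    SEARCH_BUTTON = ("id", "search")
    RESULT_COUNT = ("id", "result-count")

    def __init__(self, driver):
        self.driver = driver

    # Page method: an action a user can perform on the page
    def click_search(self):
        self.driver.find_element(*self.SEARCH_BUTTON).click()

    # Page verification: reports state and leaves the assertion to the test
    def result_count_text(self):
        return self.driver.find_element(*self.RESULT_COUNT).text


button = FakeElement()
driver = FakeDriver({
    SearchPage.SEARCH_BUTTON: button,
    SearchPage.RESULT_COUNT: FakeElement("3 results"),
})
page = SearchPage(driver)
page.click_search()
print(page.result_count_text())  # prints "3 results"
```

The test code only ever talks to SearchPage, so a changed locator would be fixed in one class attribute.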

Benefits of Using Page Object Model

  1. Maintainability: Changes to UI elements only require updates in one place
  2. Reusability: Page objects can be reused across multiple test cases
  3. Readability: Tests become more readable and business-focused
  4. Separation of Concerns: Test logic is separated from page interaction logic

Basic Page Object Implementation

Python Implementation

Here's a basic page object implementation in Python using Selenium WebDriver:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

class BasePage:
    """Base page class with common functionality"""

    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    def find_element(self, locator):
        return self.wait.until(EC.presence_of_element_located(locator))

    def find_elements(self, locator):
        return self.driver.find_elements(*locator)

    def click(self, locator):
        element = self.find_element(locator)
        element.click()

    def enter_text(self, locator, text):
        element = self.find_element(locator)
        element.clear()
        element.send_keys(text)

class LoginPage(BasePage):
    """Login page object"""

    # Page elements (locators)
    USERNAME_FIELD = (By.ID, "username")
    PASSWORD_FIELD = (By.ID, "password")
    LOGIN_BUTTON = (By.XPATH, "//button[@type='submit']")
    ERROR_MESSAGE = (By.CLASS_NAME, "error-message")

    def __init__(self, driver):
        super().__init__(driver)
        self.url = "https://example.com/login"

    def navigate_to_login(self):
        self.driver.get(self.url)

    def enter_username(self, username):
        self.enter_text(self.USERNAME_FIELD, username)

    def enter_password(self, password):
        self.enter_text(self.PASSWORD_FIELD, password)

    def click_login_button(self):
        self.click(self.LOGIN_BUTTON)

    def login(self, username, password):
        """Complete login flow"""
        self.enter_username(username)
        self.enter_password(password)
        self.click_login_button()

    def get_error_message(self):
        try:
            return self.find_element(self.ERROR_MESSAGE).text
        except Exception:
            # Typically a TimeoutException when no error message appears
            return None

    def is_login_successful(self):
        return "dashboard" in self.driver.current_url

class DashboardPage(BasePage):
    """Dashboard page object"""

    WELCOME_MESSAGE = (By.CLASS_NAME, "welcome-message")
    LOGOUT_BUTTON = (By.ID, "logout")
    USER_MENU = (By.CLASS_NAME, "user-menu")

    def __init__(self, driver):
        super().__init__(driver)

    def get_welcome_message(self):
        return self.find_element(self.WELCOME_MESSAGE).text

    def logout(self):
        self.click(self.LOGOUT_BUTTON)

    def is_page_loaded(self):
        return self.find_element(self.WELCOME_MESSAGE).is_displayed()

JavaScript Implementation

Here's the equivalent implementation in JavaScript using Selenium WebDriver:

const { Builder, By, until } = require('selenium-webdriver');

class BasePage {
    constructor(driver) {
        this.driver = driver;
        this.timeout = 10000;
    }

    async findElement(locator) {
        return await this.driver.wait(until.elementLocated(locator), this.timeout);
    }

    async findElements(locator) {
        return await this.driver.findElements(locator);
    }

    async click(locator) {
        const element = await this.findElement(locator);
        await element.click();
    }

    async enterText(locator, text) {
        const element = await this.findElement(locator);
        await element.clear();
        await element.sendKeys(text);
    }

    async getText(locator) {
        const element = await this.findElement(locator);
        return await element.getText();
    }
}

class LoginPage extends BasePage {
    constructor(driver) {
        super(driver);
        this.url = 'https://example.com/login';

        // Page elements
        this.usernameField = By.id('username');
        this.passwordField = By.id('password');
        this.loginButton = By.xpath("//button[@type='submit']");
        this.errorMessage = By.className('error-message');
    }

    async navigateToLogin() {
        await this.driver.get(this.url);
    }

    async enterUsername(username) {
        await this.enterText(this.usernameField, username);
    }

    async enterPassword(password) {
        await this.enterText(this.passwordField, password);
    }

    async clickLoginButton() {
        await this.click(this.loginButton);
    }

    async login(username, password) {
        await this.enterUsername(username);
        await this.enterPassword(password);
        await this.clickLoginButton();
    }

    async getErrorMessage() {
        try {
            return await this.getText(this.errorMessage);
        } catch (error) {
            return null;
        }
    }

    async isLoginSuccessful() {
        const currentUrl = await this.driver.getCurrentUrl();
        return currentUrl.includes('dashboard');
    }
}

class DashboardPage extends BasePage {
    constructor(driver) {
        super(driver);

        this.welcomeMessage = By.className('welcome-message');
        this.logoutButton = By.id('logout');
        this.userMenu = By.className('user-menu');
    }

    async getWelcomeMessage() {
        return await this.getText(this.welcomeMessage);
    }

    async logout() {
        await this.click(this.logoutButton);
    }

    async isPageLoaded() {
        const element = await this.findElement(this.welcomeMessage);
        return await element.isDisplayed();
    }
}

module.exports = { LoginPage, DashboardPage };

Advanced Page Object Patterns

Page Factory Pattern

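The Page Factory pattern initializes page elements automatically from annotations; Selenium's Java bindings provide it through @FindBy and PageFactory. The Python bindings have no built-in equivalent, so the example below approximates it with properties that look up each element when it is accessed:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC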

class LoginPageFactory:
    def __init__(self, driver):
        self.driver = driver
        self.wait = WebDriverWait(driver, 10)

    @property
    def username_field(self):
        return self.driver.find_element(By.ID, "username")

    @property
    def password_field(self):
        return self.driver.find_element(By.ID, "password")

    @property
    def login_button(self):
        return self.driver.find_element(By.XPATH, "//button[@type='submit']")

    def login(self, username, password):
        self.username_field.send_keys(username)
        self.password_field.send_keys(password)
        self.login_button.click()
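
The repeated @property blocks can be condensed with a Python descriptor. The sketch below is one possible approach, not part of Selenium: Element, FakeDriver, and FakeElement are hypothetical names, and the fake driver exists only so the example runs without a browser. The strings "id" and "xpath" match the values of By.ID and By.XPATH in the Python bindings.

```python
class Element:
    """Descriptor that looks up an element each time it is accessed."""

    def __init__(self, by, value):
        self.by = by
        self.value = value

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class, not on a page instance
        return instance.driver.find_element(self.by, self.value)


class FakeElement:
    """Stand-in for a WebElement that records what was done to it."""

    def __init__(self):
        self.typed = []
        self.clicked = False

    def send_keys(self, text):
        self.typed.append(text)

    def click(self):
        self.clicked = True


class FakeDriver:
    """Stand-in for a WebDriver that counts element lookups."""

    def __init__(self):
        self.elements = {}
        self.lookups = 0

    def find_element(self, by, value):
        self.lookups += 1
        return self.elements.setdefault((by, value), FakeElement())


class LoginPage:
    # Each attribute declares a locator; the lookup happens on access
    username_field = Element("id", "username")
    password_field = Element("id", "password")
    login_button = Element("xpath", "//button[@type='submit']")

    def __init__(self, driver):
        self.driver = driver

    def login(self, username, password):
        self.username_field.send_keys(username)
        self.password_field.send_keys(password)
        self.login_button.click()


driver = FakeDriver()
LoginPage(driver).login("testuser", "password123")
```

Because the lookup runs on every access, the page object never holds a stale element reference, at the cost of one extra find_element call per interaction.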

Fluent Interface Pattern

Implement method chaining for more readable test code:

class FluentLoginPage(LoginPage):
    """Reuses LoginPage's locators; each step returns a page object"""

    def navigate_to_login(self):
        super().navigate_to_login()
        return self

    def enter_username(self, username):
        super().enter_username(username)
        return self

    def enter_password(self, password):
        super().enter_password(password)
        return self

    def click_login(self):
        self.click_login_button()
        # Clicking login leaves this page, so return the next page object
        return DashboardPage(self.driver)

# Usage with method chaining
dashboard = (FluentLoginPage(driver)
    .navigate_to_login()
    .enter_username("testuser")
    .enter_password("password123")
    .click_login())

Writing Tests with Page Objects

Python Test Example

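import unittest
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Assumes the page objects above live in an importable module, e.g. pages.py
from pages import LoginPage, DashboardPage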

class TestLogin(unittest.TestCase):
    def setUp(self):
        service = Service(ChromeDriverManager().install())
        self.driver = webdriver.Chrome(service=service)
        self.login_page = LoginPage(self.driver)
        self.dashboard_page = DashboardPage(self.driver)

    def test_successful_login(self):
        self.login_page.navigate_to_login()
        self.login_page.login("valid_user", "valid_password")

        # Verify login success
        self.assertTrue(self.login_page.is_login_successful())
        self.assertTrue(self.dashboard_page.is_page_loaded())

    def test_invalid_login(self):
        self.login_page.navigate_to_login()
        self.login_page.login("invalid_user", "invalid_password")

        # Verify error message
        error_message = self.login_page.get_error_message()
        self.assertIsNotNone(error_message)
        self.assertIn("Invalid credentials", error_message)

    def tearDown(self):
        self.driver.quit()

if __name__ == "__main__":
    unittest.main()

JavaScript Test Example

const { Builder } = require('selenium-webdriver');
const { LoginPage, DashboardPage } = require('./pages');

describe('Login Tests', () => {
    let driver;
    let loginPage;
    let dashboardPage;

    beforeAll(async () => {
        driver = await new Builder().forBrowser('chrome').build();
        loginPage = new LoginPage(driver);
        dashboardPage = new DashboardPage(driver);
    });

    afterAll(async () => {
        await driver.quit();
    });

    test('should login successfully with valid credentials', async () => {
        await loginPage.navigateToLogin();
        await loginPage.login('valid_user', 'valid_password');

        // Verify login success
        expect(await loginPage.isLoginSuccessful()).toBe(true);
        expect(await dashboardPage.isPageLoaded()).toBe(true);
    });

    test('should show error message with invalid credentials', async () => {
        await loginPage.navigateToLogin();
        await loginPage.login('invalid_user', 'invalid_password');

        // Verify error message
        const errorMessage = await loginPage.getErrorMessage();
        expect(errorMessage).not.toBeNull();
        expect(errorMessage).toContain('Invalid credentials');
    });
});

Best Practices for Page Object Model

1. Keep Page Objects Simple

Page objects should only contain methods that interact with the page. Avoid complex business logic or assertions within page objects:

class LoginPage(BasePage):
    def login(self, username, password):
        """Good: Simple page interaction"""
        self.enter_username(username)
        self.enter_password(password)
        self.click_login_button()

    def verify_login_success(self):
        """Bad: Assertion logic belongs in tests"""
        assert "dashboard" in self.driver.current_url
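
A sketch of the alternative, in which the page object only reports state and the test owns the assertion. FakeDriver is a hypothetical stand-in that exposes just current_url, so the example runs without a browser.

```python
import unittest


class FakeDriver:
    """Stand-in for a WebDriver that only exposes current_url."""

    def __init__(self, current_url):
        self.current_url = current_url


class LoginPage:
    def __init__(self, driver):
        self.driver = driver

    def is_login_successful(self):
        # Returns a boolean; deciding whether that is a failure is the test's job
        return "dashboard" in self.driver.current_url


class LoginStateTest(unittest.TestCase):
    def test_reports_success_on_dashboard_url(self):
        page = LoginPage(FakeDriver("https://example.com/dashboard"))
        self.assertTrue(page.is_login_successful())

    def test_reports_failure_on_login_url(self):
        page = LoginPage(FakeDriver("https://example.com/login"))
        self.assertFalse(page.is_login_successful())


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoginStateTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the assertion in the test means the same query method can back both a positive and a negative test case.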

2. Use Descriptive Method Names

Method names should clearly describe what action they perform:

# Good
def click_submit_button(self):
    pass

def enter_search_query(self, query):
    pass

# Bad
def click(self):
    pass

def type(self, text):
    pass

3. Handle Dynamic Content

For pages with dynamic content, implement proper waiting strategies:

class ProductPage(BasePage):
    PRODUCT_TITLE = (By.CLASS_NAME, "product-title")
    PRICE_ELEMENT = (By.CLASS_NAME, "price")

    def wait_for_product_to_load(self):
        self.wait.until(EC.visibility_of_element_located(self.PRODUCT_TITLE))
        return self

    def get_product_price(self):
        self.wait_for_product_to_load()
        return self.find_element(self.PRICE_ELEMENT).text

4. Organize Page Objects by Functionality

Group related page objects and create a clear directory structure:

pages/
├── __init__.py
├── base_page.py
├── auth/
│   ├── __init__.py
│   ├── login_page.py
│   └── registration_page.py
├── dashboard/
│   ├── __init__.py
│   ├── dashboard_page.py
│   └── settings_page.py
└── products/
    ├── __init__.py
    ├── product_list_page.py
    └── product_detail_page.py

Common Page Object Model Pitfalls

1. Over-Engineering

Don't create page objects for every single element. Focus on logical groupings and reusable components.
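
One way to get reusable components is a component object: a small class for a widget that appears on several pages, which each page object then composes. The sketch below assumes a hypothetical NavigationBar shared by a dashboard and a settings page; the fake driver and element classes are stand-ins so it runs without a browser, and "class name" is the string that By.CLASS_NAME holds in the Python bindings.

```python
class FakeElement:
    """Stand-in for a WebElement that records clicks."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


class FakeDriver:
    """Stand-in for a WebDriver: maps locator tuples to fake elements."""

    def __init__(self, elements):
        self._elements = elements

    def find_element(self, by, value):
        return self._elements[(by, value)]


class NavigationBar:
    """Component object for a header shared by several pages."""

    USER_MENU = ("class name", "user-menu")
    LOGOUT_BUTTON = ("id", "logout")

    def __init__(self, driver):
        self.driver = driver

    def logout(self):
        self.driver.find_element(*self.USER_MENU).click()
        self.driver.find_element(*self.LOGOUT_BUTTON).click()


class DashboardPage:
    def __init__(self, driver):
        self.driver = driver
        self.nav = NavigationBar(driver)  # composed, not re-implemented


class SettingsPage:
    def __init__(self, driver):
        self.driver = driver
        self.nav = NavigationBar(driver)


menu, logout_button = FakeElement(), FakeElement()
driver = FakeDriver({
    NavigationBar.USER_MENU: menu,
    NavigationBar.LOGOUT_BUTTON: logout_button,
})
SettingsPage(driver).nav.logout()
```

The logout locators live in one class, so neither page needs its own copy.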

2. Mixing Concerns

Keep page objects focused on page interactions. Don't include test data, assertions, or complex business logic.

3. Not Using Inheritance

Create base page classes to avoid code duplication for common functionality like navigation and waiting.

4. Hardcoding Waits

Use explicit waits instead of hardcoded sleep statements to handle dynamic content properly.
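
To illustrate the difference, the sketch below shows the polling idea behind an explicit wait: check a condition repeatedly and return as soon as it holds, instead of sleeping for a fixed time. This is not Selenium's implementation; wait_until and SlowPage are hypothetical names, and in a real suite WebDriverWait with an expected condition fills this role.

```python
import time


def wait_until(condition, timeout=2.0, poll_interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class SlowPage:
    """Hypothetical page whose title only appears after a few polls."""

    def __init__(self, ready_after_calls):
        self.calls = 0
        self.ready_after_calls = ready_after_calls

    def product_title(self):
        self.calls += 1
        if self.calls >= self.ready_after_calls:
            return "Blue Widget"
        return None


page = SlowPage(ready_after_calls=3)
title = wait_until(page.product_title)
print(title)  # prints "Blue Widget"
```

A fixed sleep would either wait longer than needed or not long enough; the polling loop returns on the third check here and fails with a clear error if the content never appears.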

Java Implementation Example

Here's a Java implementation using Selenium WebDriver and PageFactory. DashboardPage is assumed to be defined along the same lines as LoginPage:

import java.time.Duration;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;
import org.openqa.selenium.support.ui.WebDriverWait;
import org.openqa.selenium.support.ui.ExpectedConditions;

public class BasePage {
    protected WebDriver driver;
    protected WebDriverWait wait;

    public BasePage(WebDriver driver) {
        this.driver = driver;
        // Selenium 4 expects a Duration here rather than a number of seconds
        this.wait = new WebDriverWait(driver, Duration.ofSeconds(10));
        PageFactory.initElements(driver, this);
    }

    protected void clickElement(WebElement element) {
        wait.until(ExpectedConditions.elementToBeClickable(element));
        element.click();
    }

    protected void enterText(WebElement element, String text) {
        wait.until(ExpectedConditions.visibilityOf(element));
        element.clear();
        element.sendKeys(text);
    }
}

public class LoginPage extends BasePage {
    @FindBy(id = "username")
    private WebElement usernameField;

    @FindBy(id = "password")
    private WebElement passwordField;

    @FindBy(xpath = "//button[@type='submit']")
    private WebElement loginButton;

    @FindBy(className = "error-message")
    private WebElement errorMessage;

    public LoginPage(WebDriver driver) {
        super(driver);
    }

    public void navigateToLogin() {
        driver.get("https://example.com/login");
    }

    public void enterUsername(String username) {
        enterText(usernameField, username);
    }

    public void enterPassword(String password) {
        enterText(passwordField, password);
    }

    public void clickLoginButton() {
        clickElement(loginButton);
    }

    public DashboardPage login(String username, String password) {
        enterUsername(username);
        enterPassword(password);
        clickLoginButton();
        return new DashboardPage(driver);
    }

    public String getErrorMessage() {
        try {
            return errorMessage.getText();
        } catch (Exception e) {
            return null;
        }
    }

    public boolean isLoginSuccessful() {
        return driver.getCurrentUrl().contains("dashboard");
    }
}

C# Implementation Example

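Here's a C# implementation using Selenium WebDriver. ExpectedConditions comes from the separate DotNetSeleniumExtras.WaitHelpers NuGet package, and DashboardPage is assumed to be defined along the same lines as LoginPage:

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;
using SeleniumExtras.WaitHelpers;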

public class BasePage
{
    protected IWebDriver driver;
    protected WebDriverWait wait;

    public BasePage(IWebDriver driver)
    {
        this.driver = driver;
        this.wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
    }

    protected IWebElement FindElement(By locator)
    {
        return wait.Until(ExpectedConditions.ElementExists(locator));
    }

    protected void ClickElement(By locator)
    {
        var element = wait.Until(ExpectedConditions.ElementToBeClickable(locator));
        element.Click();
    }

    protected void EnterText(By locator, string text)
    {
        var element = FindElement(locator);
        element.Clear();
        element.SendKeys(text);
    }
}

public class LoginPage : BasePage
{
    private readonly By usernameField = By.Id("username");
    private readonly By passwordField = By.Id("password");
    private readonly By loginButton = By.XPath("//button[@type='submit']");
    private readonly By errorMessage = By.ClassName("error-message");

    public LoginPage(IWebDriver driver) : base(driver)
    {
    }

    public void NavigateToLogin()
    {
        driver.Navigate().GoToUrl("https://example.com/login");
    }

    public void EnterUsername(string username)
    {
        EnterText(usernameField, username);
    }

    public void EnterPassword(string password)
    {
        EnterText(passwordField, password);
    }

    public void ClickLoginButton()
    {
        ClickElement(loginButton);
    }

    public DashboardPage Login(string username, string password)
    {
        EnterUsername(username);
        EnterPassword(password);
        ClickLoginButton();
        return new DashboardPage(driver);
    }

    public string GetErrorMessage()
    {
        try
        {
            return FindElement(errorMessage).Text;
        }
        catch
        {
            return null;
        }
    }

    public bool IsLoginSuccessful()
    {
        return driver.Url.Contains("dashboard");
    }
}

Integration with Test Frameworks

The Page Object Model pattern is independent of the test runner. The Python tests above use unittest and the JavaScript tests use Jest-style describe/test blocks, but the same page classes work unchanged under other runners, because the only thing a page object needs is a driver instance.

The pattern also carries over to other browser automation work. Web scraping code, for example, can wrap each target page in a class that hides its locators and exposes methods for the data it returns, and teams that use several automation tools can apply the same structure in each of them.
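
As a sketch of how the page objects plug into a runner, the example below puts driver setup and cleanup in a shared unittest base class. The driver_factory hook, BasePageTest, and FakeDriver are assumptions made for illustration; a real suite would point the hook at a browser driver class such as webdriver.Chrome.

```python
import unittest


class FakeDriver:
    """Stand-in for a WebDriver so the sketch runs without a browser."""

    def __init__(self):
        self.current_url = "about:blank"
        self.quit_called = False

    def get(self, url):
        self.current_url = url

    def quit(self):
        self.quit_called = True


class LoginPage:
    URL = "https://example.com/login"

    def __init__(self, driver):
        self.driver = driver

    def navigate_to_login(self):
        self.driver.get(self.URL)


class BasePageTest(unittest.TestCase):
    # Hypothetical hook: a class that creates the driver for each test
    driver_factory = FakeDriver

    def setUp(self):
        self.driver = self.driver_factory()
        # addCleanup runs even if a later setUp step or the test itself fails
        self.addCleanup(self.driver.quit)
        self.login_page = LoginPage(self.driver)


class LoginNavigationTest(BasePageTest):
    def test_opens_login_url(self):
        self.login_page.navigate_to_login()
        self.assertEqual(self.driver.current_url, LoginPage.URL)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoginNavigationTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The page classes themselves contain nothing runner-specific, which is what lets them move between frameworks.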

Conclusion

The Page Object Model pattern is essential for creating maintainable and scalable test automation suites with Selenium WebDriver. By encapsulating page elements and interactions within dedicated classes, you create a clear separation between test logic and page-specific code. This approach reduces maintenance overhead, improves code reusability, and makes your tests more readable and reliable.

Remember to keep your page objects simple, use descriptive method names, handle dynamic content properly, and organize your code structure logically. With these practices, you'll build robust test automation frameworks that can evolve with your application's changing requirements.
