Can I use Simple HTML DOM with PHP 7.x or 8.x?

Compatibility Status

Simple HTML DOM is not officially compatible with PHP 7.x or 8.x. The original library was designed for older PHP versions and has not been actively maintained to support modern PHP features and syntax changes.

Common Issues and Errors

When running legacy releases of Simple HTML DOM on PHP 7.x or 8.x, you may encounter several errors, depending on the PHP version:

1. Deprecated each() Function

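// PHP 7.2 to 7.4: each() is deprecated
Deprecated: The each() function is deprecated. This message will be suppressed on further calls
// PHP 8.0 and later: each() has been removed
Fatal error: Uncaught Error: Call to undefined function each()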
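
Legacy loops built on each() can usually be rewritten with foreach. The following is a minimal sketch, assuming a hypothetical $attributes array rather than the library's actual internals:

```php
<?php
// Hypothetical attribute map, standing in for data a legacy parser might iterate.
$attributes = ['href' => '/home', 'class' => 'nav-link'];

// Legacy pattern (deprecated in PHP 7.2, removed in PHP 8.0):
// reset($attributes);
// while (list($name, $value) = each($attributes)) { ... }

// Replacement: foreach visits the same key/value pairs in the same order.
$pairs = [];
foreach ($attributes as $name => $value) {
    $pairs[] = $name . '="' . $value . '"';
}

echo implode(' ', $pairs) . "\n";
```

Unlike each(), foreach does not depend on the array's internal pointer, so no reset() call is needed first.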

2. Constructor Issues

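// PHP 7.x deprecates PHP 4-style constructors (a method named after its class);
// PHP 8.0 no longer treats such methods as constructors
Deprecated: Methods with the same name as their class will not be constructors in a future version of PHP; ClassName has a deprecated constructor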
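
Constructor problems in legacy code typically come from PHP 4-style constructors, which PHP 7 deprecates and PHP 8 no longer calls. A minimal sketch of the fix, using a hypothetical LegacyNode class that is not part of Simple HTML DOM:

```php
<?php
// Hypothetical class, used only to illustrate the constructor change.
class LegacyNode
{
    public $tag;

    // PHP 4 style: a method named after the class.
    // PHP 7 emits a deprecation notice; PHP 8 treats it as an ordinary
    // method and never calls it from `new`.
    // function LegacyNode($tag) { $this->tag = $tag; }

    // Portable replacement that works on PHP 5, 7 and 8.
    public function __construct($tag)
    {
        $this->tag = $tag;
    }
}

$node = new LegacyNode('div');
echo $node->tag . "\n"; // prints "div"
```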

3. Property Access Warnings

// Raised when code modifies an array returned by a magic __get() method
Notice: Indirect modification of overloaded property
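
This notice appears when a class exposes an array through __get() by value and the caller then tries to modify it in place. One common fix is to return by reference. The sketch below uses a hypothetical AttributeBag class to show the idea; it is not taken from the library:

```php
<?php
// Hypothetical container that exposes stored values through __get().
class AttributeBag
{
    private $data = ['classes' => []];

    // Returning by reference (note the &) lets callers modify the stored
    // array in place. Without the &, `$bag->classes[] = ...` would change a
    // temporary copy and PHP would raise
    // "Indirect modification of overloaded property".
    public function &__get($name)
    {
        return $this->data[$name];
    }
}

$bag = new AttributeBag();
$bag->classes[] = 'active';
$bag->classes[] = 'nav-link';

echo implode(' ', $bag->classes) . "\n"; // prints "active nav-link"
```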

4. Type Declaration Issues

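Modern PHP's stricter type checking causes additional compatibility problems with the library's internal methods. In particular, PHP 8 throws a TypeError when a built-in function receives an argument of the wrong type, where PHP 7 only emitted a warning.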
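
As an illustration of the stricter behavior: on PHP 8, strlen([]) throws a TypeError, while PHP 7 raised a warning and returned null. A defensive check before the call behaves the same on both versions. The safeLength() helper below is a hypothetical name introduced for this sketch:

```php
<?php
// Hypothetical helper: validate the type before calling a built-in function,
// so the result is the same on PHP 7 and PHP 8.
function safeLength($value): ?int
{
    return is_string($value) ? strlen($value) : null;
}

var_dump(safeLength('<p>text</p>'));           // an integer length
var_dump(safeLength(['not', 'a', 'string']));  // NULL instead of a TypeError
```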

Solutions and Alternatives

Option 1: Use a Maintained Fork

The sunra/php-simple-html-dom-parser package is a Composer-compatible fork that fixes PHP 7.x/8.x compatibility:

composer require sunra/php-simple-html-dom-parser

<?php
require 'vendor/autoload.php';

use Sunra\PhpSimple\HtmlDomParser;

// Load HTML from URL
$dom = HtmlDomParser::file_get_html('https://example.com');

// Find all paragraphs
foreach($dom->find('p') as $paragraph) {
    echo $paragraph->plaintext . "\n";
}

// Find elements by class
$items = $dom->find('.item');
foreach($items as $item) {
    echo $item->innertext . "\n";
}

Option 2: PHP's Built-in DOMDocument

DOMDocument is fully compatible with all PHP versions and provides robust HTML parsing:

<?php
$dom = new DOMDocument();
libxml_use_internal_errors(true); // Suppress HTML5 warnings

// Load HTML
$html = file_get_contents('https://example.com');
$dom->loadHTML($html);

// Using XPath for complex queries
$xpath = new DOMXPath($dom);

// Find divs whose class attribute is exactly "content"
// (elements that carry additional classes will not match)
$nodes = $xpath->query("//div[@class='content']");
foreach ($nodes as $node) {
    echo $node->textContent . "\n";
}

// Find all links
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
    echo $link->getAttribute('href') . " - " . $link->textContent . "\n";
}
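
An XPath test such as @class='content' compares the whole attribute value, so an element with class="content featured" is skipped. A minimal sketch of token-style class matching follows, using sample markup invented for this example:

```php
<?php
// Sample markup (an assumption for this sketch); the second div has two classes.
$html = '<div class="content">First</div>'
      . '<div class="content featured">Second</div>'
      . '<div class="sidebar">Third</div>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Token match: pad the class list with spaces, then look for " content ".
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' content ')]";

$texts = [];
foreach ($xpath->query($query) as $node) {
    $texts[] = $node->textContent;
}

print_r($texts); // contains "First" and "Second", but not "Third"
```

The padding with spaces avoids false positives on class names that merely contain the word, such as "content-wide".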

Option 3: Symfony DOMCrawler

Symfony DOMCrawler provides a jQuery-like interface for HTML manipulation:

composer require symfony/dom-crawler symfony/css-selector

<?php
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$html = file_get_contents('https://example.com');
$crawler = new Crawler($html);

// CSS selectors (requires symfony/css-selector)
$titles = $crawler->filter('h1, h2, h3');
$titles->each(function (Crawler $node) {
    echo $node->text() . "\n";
});

// Find by attribute
$links = $crawler->filter('a[href*="github"]');
$links->each(function (Crawler $node) {
    echo $node->attr('href') . "\n";
});

// Extract data
$data = $crawler->filter('.product')->each(function (Crawler $node) {
    return [
        'name' => $node->filter('.name')->text(),
        'price' => $node->filter('.price')->text(),
        'url' => $node->filter('a')->attr('href')
    ];
});

Option 4: Other Modern Libraries

DiDOM: A fast HTML/XML parser with a simple API:

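composer require imangazaliev/didom

<?php
require 'vendor/autoload.php';

use DiDom\Document;

// Second argument true: treat the first argument as a file path or URL
$document = new Document('https://example.com', true);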
$posts = $document->find('.post');

foreach ($posts as $post) {
    echo $post->text() . "\n";
}

QueryPath: jQuery-style syntax for PHP:

composer require querypath/querypath

<?php
require 'vendor/autoload.php';

$html = file_get_contents('https://example.com');
$qp = qp($html);

$qp->find('p')->each(function($index, $element) {
    echo qp($element)->text() . "\n";
});

Migration Guide

If migrating from Simple HTML DOM, here's a comparison of common operations:

| Simple HTML DOM | DOMDocument | Symfony DOMCrawler |
|-----------------|-------------|--------------------|
| $html->find('p') | $dom->getElementsByTagName('p') | $crawler->filter('p') |
| $element->plaintext | $element->textContent | $node->text() |
| $element->innertext | $dom->saveHTML($child) for each child node | $node->html() |
| $element->href | $element->getAttribute('href') | $node->attr('href') |
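
DOMDocument has no single property that returns a node's inner markup; nodeValue and textContent return text only. A minimal sketch of an inner-HTML helper follows. The innerHtml() function name and the sample markup are assumptions made for this example:

```php
<?php
// Hypothetical helper: returns the markup inside a node, similar in spirit
// to Simple HTML DOM's innertext.
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML() with a node argument serializes just that node.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML('<div id="box">Hello <b>world</b></div>');
libxml_clear_errors();

$box = $dom->getElementsByTagName('div')->item(0);

echo innerHtml($box) . "\n";   // markup of the children
echo $box->textContent . "\n"; // text only, tags stripped
```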

Recommendation

For new projects, use Symfony DOMCrawler for its powerful CSS selector support and modern API. For simple parsing needs, PHP's built-in DOMDocument is sufficient and has no external dependencies.

Avoid the original Simple HTML DOM library on PHP 7.x/8.x. If you need its API, use a maintained fork such as sunra/php-simple-html-dom-parser instead.
