How do I save the scraped data into a database using DiDOM?

DiDOM is a simple and fast HTML parser written in PHP. If you're using DiDOM to scrape data from web pages and you want to save the scraped data into a database, you'll typically follow this workflow:

  1. Use DiDOM to parse the HTML and extract the necessary data.
  2. Prepare a connection to your database using PHP's PDO or mysqli.
  3. Insert the data into the database using prepared statements.

Here is an example of how you could do this:

First, make sure you have DiDOM installed. You can install it via Composer:

composer require imangazaliev/didom

Now, let's assume you've already scraped some data using DiDOM and you want to save it into a MySQL database.

<?php

require_once 'vendor/autoload.php';

use DiDom\Document;

// The HTML you fetched (in a real scraper this would come from an HTTP request)
$html = '<html><body><h1>Hello World!</h1></body></html>';

// Create a new document instance
$document = new Document($html);

// Extract the data you want to save, for example, the text of an <h1> element
$h1Element = $document->first('h1');
$h1 = $h1Element !== null ? $h1Element->text() : '';

// Database configuration
$host = '127.0.0.1';
$db   = 'your_database';
$user = 'your_username';
$pass = 'your_password';
$charset = 'utf8mb4';

// Set up the DSN (Data Source Name)
$dsn = "mysql:host=$host;dbname=$db;charset=$charset";
$options = [
    PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
    PDO::ATTR_EMULATE_PREPARES   => false,
];

try {
    // Create a PDO instance (connect to the database)
    $pdo = new PDO($dsn, $user, $pass, $options);

    // Prepare an INSERT statement
    $stmt = $pdo->prepare("INSERT INTO your_table (column_name) VALUES (:value)");

    // Bind the value to the placeholder in the statement
    $stmt->bindParam(':value', $h1);

    // Execute the statement to insert the data
    $stmt->execute();

    echo "Data inserted successfully!";
} catch (\PDOException $e) {
    // Re-throw the exception; in production you would log it and handle it more gracefully
    throw new \PDOException($e->getMessage(), (int)$e->getCode());
}

In this example, replace your_database, your_username, your_password, your_table, and column_name with the actual values for your database setup.

Keep in mind that this is a very basic example. In a real-world scenario, you should also:

  • Handle exceptions and errors more gracefully.
  • Sanitize and validate the data before inserting it into the database.
  • Check if the data already exists in the database to prevent duplicates.
  • Use transactions if you are doing multiple related inserts/updates to ensure data integrity, as shown in the sketch below.
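
For example, the last two points can be handled with a UNIQUE index and a transaction. The following is a minimal sketch, assuming a hypothetical scraped_items table with url and title columns, a UNIQUE index on url, and the $pdo connection created in the example above:

// Sketch only: assumes a hypothetical scraped_items table with url and title
// columns, a UNIQUE index on url, and the $pdo connection created above.
$items = [
    ['url' => 'https://example.com/page-1', 'title' => 'First page'],
    ['url' => 'https://example.com/page-2', 'title' => 'Second page'],
];

try {
    // Group the related inserts into a single transaction
    $pdo->beginTransaction();

    // The UNIQUE index on url plus ON DUPLICATE KEY UPDATE avoids duplicate rows
    $stmt = $pdo->prepare(
        "INSERT INTO scraped_items (url, title) VALUES (:url, :title)
         ON DUPLICATE KEY UPDATE title = VALUES(title)"
    );

    foreach ($items as $item) {
        // Basic validation before touching the database
        if (empty($item['url']) || empty($item['title'])) {
            continue;
        }
        $stmt->execute([
            ':url'   => $item['url'],
            ':title' => $item['title'],
        ]);
    }

    $pdo->commit();
} catch (\PDOException $e) {
    // Roll back the whole batch if any insert fails, then log the error
    $pdo->rollBack();
    error_log($e->getMessage());
}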

Remember that the structure and design of your database will depend on the data you are scraping and how you intend to use it. Always design your database tables to best fit the structure of your data and the needs of your application.
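
As a starting point, a schema for the hypothetical scraped_items table used in the sketch above could look like this (adjust the column names and types to match your own data):

// Sketch only: creates the hypothetical scraped_items table with a UNIQUE
// index on url so the same page is not inserted twice.
$pdo->exec("
    CREATE TABLE IF NOT EXISTS scraped_items (
        id         INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        url        VARCHAR(255) NOT NULL,
        title      VARCHAR(255) NOT NULL,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        UNIQUE KEY uniq_url (url)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
");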
