Goutte is a screen scraping and web crawling library for PHP. It provides an API to simulate browser actions, such as clicking links and submitting forms. Under the hood, Goutte relies on a PHP HTTP client to make requests: Guzzle in versions up to 3, and Symfony's HttpClient component from version 4 onward.
When you want to submit a form using Goutte, you first need to navigate to the page with the form, select the form, fill in the necessary fields, and then submit it. Goutte handles form submissions similarly to how a browser would.
Here's a step-by-step guide on how to submit forms with Goutte:
- Install Goutte: If you haven't already installed Goutte, you can do so using Composer. Run the following command in your project directory:
composer require fabpot/goutte
- Create a Goutte Client Instance: Before you can submit a form, you need to create an instance of the Goutte client.
require 'vendor/autoload.php';
use Goutte\Client;
$client = new Client();
- Navigate to the Page with the Form: Use the request method to navigate to the page that contains the form you want to submit.
$crawler = $client->request('GET', 'https://example.com/form-page');
- Select the Form: Use the selectButton method to find the form's submit button by its text or name. Then call form on the result to get a form object.
$form = $crawler->selectButton('Submit')->form();
- Fill in the Form Fields: Fill in the form fields either by passing an associative array to the form method or by assigning values through array access on the form object, as shown here. In both cases the keys are the names of the form fields, and the values are what you want to submit.
$form['field_name'] = 'value'; // for input fields
$form['checkbox_name'] = true; // for checkboxes
$form['select_name'] = 'option_value'; // for select dropdowns
// etc.
- Submit the Form: Use the client's submit method to submit the form.
$resultCrawler = $client->submit($form);
- Process the Response: After submitting the form, you can process the response through the $resultCrawler object, a crawler over the page returned by the submission.
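The field-filling step can be tried without any network access by pointing DomCrawler (the Symfony component that provides Goutte's crawler and form objects) at an inline HTML string. The sketch below is illustrative only: the markup, field names, and URL are invented, and it assumes Goutte was installed through Composer so that symfony/dom-crawler is available.

```php
<?php
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical form markup standing in for a real page.
$html = <<<'HTML'
<html><body>
<form action="/subscribe" method="post">
  <input type="text" name="email">
  <input type="checkbox" name="newsletter" value="yes">
  <select name="plan">
    <option value="free">Free</option>
    <option value="pro">Pro</option>
  </select>
  <input type="submit" value="Submit">
</form>
</body></html>
HTML;

// The second argument is the page URI; the form needs it to resolve its relative action.
$crawler = new Crawler($html, 'https://example.com/form-page');
$form = $crawler->selectButton('Submit')->form();

$form['email'] = 'user@example.com'; // text input, set through array access
$form['newsletter']->tick();         // checkbox, same effect as assigning true
$form['plan']->select('pro');        // select dropdown, chosen by option value

$values = $form->getValues(); // field name => value pairs that would be sent
$method = $form->getMethod(); // HTTP method, upper-cased
$uri    = $form->getUri();    // absolute URI the form would be submitted to

print_r($values);
echo $method . ' ' . $uri . "\n";
```

Inspecting getValues() before calling submit is a cheap way to confirm that the field names match what the page actually defines, since assigning to a field that does not exist raises an exception.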
Here's a full example that puts all the steps together:
require 'vendor/autoload.php';
use Goutte\Client;
$client = new Client();
// Navigate to the login page
$crawler = $client->request('GET', 'https://example.com/login');
// Select the form and set the values
$form = $crawler->selectButton('Sign in')->form([
    'username' => 'user@example.com',
    'password' => 'password123',
]);
// Submit the form
$resultCrawler = $client->submit($form);
// You can now parse the response, for example, check if login was successful
if ($resultCrawler->filter('div.success')->count() > 0) {
    echo "Login successful!\n";
} else {
    echo "Login failed.\n";
}
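The success check above depends entirely on the markup of the target site. As a rough sketch of how that check could be isolated and exercised offline, the helper below classifies a result page from static HTML. The function name loginOutcome and the div.success / div.error selectors are assumptions made for illustration; a real site will use its own markup. It also assumes symfony/css-selector is installed, which Goutte pulls in as a dependency.

```php
<?php
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical helper: classify the page returned after a login attempt.
 * The CSS selectors are placeholders, not something Goutte defines.
 */
function loginOutcome(Crawler $page): string
{
    if ($page->filter('div.success')->count() > 0) {
        return 'success';
    }

    $errors = $page->filter('div.error');
    if ($errors->count() > 0) {
        // Report the first error message shown on the page.
        return 'failed: ' . trim($errors->first()->text());
    }

    // Neither marker was found, so the page layout may have changed.
    return 'unknown';
}

// Static pages standing in for real responses.
$okPage    = new Crawler('<html><body><div class="success">Welcome back</div></body></html>');
$errorPage = new Crawler('<html><body><div class="error"> Bad password </div></body></html>');
$otherPage = new Crawler('<html><body><p>Maintenance</p></body></html>');

echo loginOutcome($okPage) . "\n";
echo loginOutcome($errorPage) . "\n";
echo loginOutcome($otherPage) . "\n";
```

Keeping a separate "unknown" outcome avoids reporting a failed login when the page simply no longer contains either marker.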
Remember to account for redirects if the form submission triggers one. Goutte follows redirects automatically by default, but you can change this behavior by calling the followRedirects method on the client, for example with false to disable it.
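As a minimal sketch of turning redirect handling off, the snippet below disables automatic following and reads the redirect target from the response instead. The helper submitAndGetRedirectTarget is hypothetical, written only for this illustration; followRedirects, submit, getInternalResponse, and getHeader come from the BrowserKit client that Goutte extends.

```php
<?php
require 'vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\DomCrawler\Form;

$client = new Client();

// Turn off automatic redirect handling for this client.
$client->followRedirects(false);

/**
 * Hypothetical helper: submit a form and return the redirect target
 * (the Location header), or null when the response carries none.
 */
function submitAndGetRedirectTarget(Client $client, Form $form): ?string
{
    $client->submit($form);

    $location = $client->getInternalResponse()->getHeader('Location');

    return is_string($location) ? $location : null;
}
```

After inspecting the target, calling followRedirect() on the client follows that single hop manually and returns the crawler for the destination page.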
Also, keep in mind that web scraping can have legal and ethical implications. Always make sure you have permission to scrape a website, and comply with its robots.txt file and terms of service.