Yes, Colly can submit forms on websites. Although it is primarily a scraping framework, its collector can send POST and GET requests carrying form data, which is enough to automate most server-rendered forms from a Go application.
How Colly Handles Form Submissions
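Colly does not fill in forms for you. A submission is assembled in your own callbacks and typically involves:
1. Parsing form elements from the HTML page
2. Extracting form attributes (action URL, method, field names)
3. Sending an HTTP request with the form data
4. Reusing cookies across requests, which the collector's cookie jar handles automatically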
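Step 2 can be sketched with the standard library alone. The helper below, resolveFormTarget, is a hypothetical name and not part of Colly; it resolves a form's action attribute against the page URL and normalizes the method, assuming the usual HTML defaults (an empty action submits to the current page, a missing method means GET).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// resolveFormTarget returns the absolute submission URL and the upper-cased
// HTTP method for a form found on pageURL. Hypothetical helper for illustration.
func resolveFormTarget(pageURL, action, method string) (string, string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", "", fmt.Errorf("parse page URL: %w", err)
	}
	ref, err := url.Parse(strings.TrimSpace(action))
	if err != nil {
		return "", "", fmt.Errorf("parse action: %w", err)
	}
	// An empty action resolves to the page URL itself.
	target := base.ResolveReference(ref).String()

	// HTML forms default to GET when the method attribute is missing.
	m := strings.ToUpper(strings.TrimSpace(method))
	if m == "" {
		m = "GET"
	}
	return target, m, nil
}

func main() {
	target, method, err := resolveFormTarget("https://example.com/account/login", "/auth/login", "post")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(method, target)
}
```

Inside a Colly callback the URL part is usually handled by e.Request.AbsoluteURL, as the examples in this article do; the sketch only makes the resolution rules explicit.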
Basic Form Submission Example
Here's a complete example demonstrating form submission with Colly:
package main

import (
	"fmt"
	"log"

	"github.com/gocolly/colly/v2"
)

func main() {
	c := colly.NewCollector()

	// When the login form is found, submit it
	c.OnHTML("form[action='/login']", func(e *colly.HTMLElement) {
		// Resolve the form's action attribute against the page URL
		actionURL := e.Request.AbsoluteURL(e.Attr("action"))

		// Collector.Post expects a map[string]string, not url.Values
		formData := map[string]string{
			"username": "your_username",
			"password": "your_password",
		}

		// Submit the form as a URL-encoded POST request
		if err := c.Post(actionURL, formData); err != nil {
			log.Printf("Form submission failed: %v", err)
		}
	})

	// Runs for every response, including the initial GET of the login page.
	// A 200 alone does not prove the login worked; inspect the final URL or body.
	c.OnResponse(func(r *colly.Response) {
		fmt.Printf("%s -> status %d\n", r.Request.URL, r.StatusCode)
	})

	// Handle errors
	c.OnError(func(r *colly.Response, err error) {
		log.Printf("Error: %v", err)
	})

	// Visit the login page
	if err := c.Visit("https://example.com/login"); err != nil {
		log.Fatal(err)
	}
}
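For reference, the body of a URL-encoded form POST has the same shape as the output of the standard library's url.Values.Encode. The sketch below shows how field names and values are escaped and joined. encodeForm is a hypothetical helper, not a Colly function, and it assumes one value per field.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodeForm renders fields as an application/x-www-form-urlencoded body.
// Keys come out in sorted order, which is how url.Values.Encode behaves.
func encodeForm(fields map[string]string) string {
	values := url.Values{}
	for name, value := range fields {
		values.Set(name, value)
	}
	return values.Encode()
}

func main() {
	body := encodeForm(map[string]string{
		"username": "jane doe",
		"password": "p&ss=word",
	})
	fmt.Println(body) // password=p%26ss%3Dword&username=jane+doe
}
```

Seeing the encoded body helps when comparing a scripted submission with what the browser's network inspector shows for the same form.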
Advanced Form Handling with CSRF Tokens
Many modern websites use CSRF tokens for security. Here's how to handle them:
package main

import (
	"fmt"
	"log"

	"github.com/gocolly/colly/v2"
)

func main() {
	c := colly.NewCollector()
	var csrfToken string

	// Extract the CSRF token first. OnHTML callbacks run in registration
	// order, so the token is set before the form handler below runs.
	c.OnHTML("input[name='_token']", func(e *colly.HTMLElement) {
		csrfToken = e.Attr("value")
		fmt.Printf("CSRF Token found: %s\n", csrfToken)
	})

	// Submit the form with the CSRF token
	c.OnHTML("form#contact-form", func(e *colly.HTMLElement) {
		if csrfToken == "" {
			log.Println("No CSRF token found; skipping submission")
			return
		}
		actionURL := e.Request.AbsoluteURL(e.Attr("action"))

		// Collector.Post expects a map[string]string, not url.Values
		formData := map[string]string{
			"_token":  csrfToken,
			"name":    "John Doe",
			"email":   "john@example.com",
			"message": "Hello from Colly!",
		}
		if err := c.Post(actionURL, formData); err != nil {
			log.Printf("Form submission failed: %v", err)
		}
	})

	c.OnResponse(func(r *colly.Response) {
		fmt.Printf("Response status: %d\n", r.StatusCode)
	})

	if err := c.Visit("https://example.com/contact"); err != nil {
		log.Fatal(err)
	}
}
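Why the token and the cookies must travel together can be illustrated without Colly. The following is a minimal standard-library sketch under stated assumptions: newDemoServer is a made-up local test server that accepts a token only alongside the session cookie it issued, and the regular expression is a deliberately naive stand-in for real HTML parsing. All names here are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"net/url"
	"regexp"
)

// tokenPattern is a naive matcher for this sketch only; real pages should be
// read with an HTML parser, as Colly's OnHTML does.
var tokenPattern = regexp.MustCompile(`name="_token" value="([^"]+)"`)

// newDemoServer starts a hypothetical site whose CSRF token is only valid
// together with the session cookie issued on the same GET.
func newDemoServer() *httptest.Server {
	const session, token = "session-1", "token-abc"
	mux := http.NewServeMux()
	mux.HandleFunc("/contact", func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodGet {
			http.SetCookie(w, &http.Cookie{Name: "session", Value: session, Path: "/"})
			fmt.Fprintf(w, `<form id="contact-form" method="post"><input type="hidden" name="_token" value="%s"></form>`, token)
			return
		}
		cookie, err := r.Cookie("session")
		if err != nil || cookie.Value != session || r.PostFormValue("_token") != token {
			http.Error(w, "CSRF check failed", http.StatusForbidden)
			return
		}
		fmt.Fprint(w, "message received")
	})
	return httptest.NewServer(mux)
}

// submitWithToken loads the page, extracts the token, and posts it back with
// the same client. It returns the status code of the POST.
func submitWithToken(client *http.Client, pageURL string) (int, error) {
	resp, err := client.Get(pageURL)
	if err != nil {
		return 0, err
	}
	page, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	if err != nil {
		return 0, err
	}
	match := tokenPattern.FindSubmatch(page)
	if match == nil {
		return 0, fmt.Errorf("no CSRF token found on %s", pageURL)
	}
	form := url.Values{"_token": {string(match[1])}, "message": {"Hello"}}
	resp, err = client.PostForm(pageURL, form)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// runDemo performs the round trip with or without a cookie jar.
func runDemo(withCookies bool) (int, error) {
	srv := newDemoServer()
	defer srv.Close()
	client := &http.Client{}
	if withCookies {
		jar, err := cookiejar.New(nil)
		if err != nil {
			return 0, err
		}
		client.Jar = jar
	}
	return submitWithToken(client, srv.URL+"/contact")
}

func main() {
	for _, withCookies := range []bool{true, false} {
		status, err := runDemo(withCookies)
		fmt.Printf("cookies=%v status=%d err=%v\n", withCookies, status, err)
	}
}
```

In this sketch the submission is accepted only when the cookie jar is present. The same reasoning is why the token should be fetched and submitted with one Colly collector rather than two separate ones.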
Multi-Step Form Submission
For complex workflows involving multiple forms or steps:
package main

import (
	"fmt"
	"log"

	"github.com/gocolly/colly/v2"
)

func main() {
	c := colly.NewCollector()

	// Step 1: Login form
	c.OnHTML("form[action='/auth/login']", func(e *colly.HTMLElement) {
		actionURL := e.Request.AbsoluteURL(e.Attr("action"))
		formData := map[string]string{
			"username": "user123",
			"password": "password123",
		}
		if err := c.Post(actionURL, formData); err != nil {
			log.Printf("Login failed: %v", err)
		}
	})

	// Step 2: Profile update form (after successful login)
	c.OnHTML("form[action='/profile/update']", func(e *colly.HTMLElement) {
		actionURL := e.Request.AbsoluteURL(e.Attr("action"))
		formData := map[string]string{
			"first_name": "John",
			"last_name":  "Doe",
			"bio":        "Software Developer",
		}
		if err := c.Post(actionURL, formData); err != nil {
			log.Printf("Profile update failed: %v", err)
		}
	})

	// Handle redirects and responses
	c.OnResponse(func(r *colly.Response) {
		fmt.Printf("Visited: %s (Status: %d)\n", r.Request.URL, r.StatusCode)
		// After the login redirects to the dashboard, continue to the profile page
		if r.Request.URL.Path == "/dashboard" {
			if err := c.Visit(r.Request.AbsoluteURL("/profile")); err != nil {
				log.Printf("Could not open profile page: %v", err)
			}
		}
	})

	// Start the process
	if err := c.Visit("https://example.com/login"); err != nil {
		log.Fatal(err)
	}
}
File Upload Forms
Colly can also handle file uploads:
package main

import (
	"bytes"
	"io"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"

	"github.com/gocolly/colly/v2"
)

// submitFileUpload posts the file at filePath to actionURL as multipart/form-data.
func submitFileUpload(c *colly.Collector, actionURL, filePath string) error {
	// Open the file
	file, err := os.Open(filePath)
	if err != nil {
		return err
	}
	defer file.Close()

	// Create multipart form
	var body bytes.Buffer
	writer := multipart.NewWriter(&body)

	// Add file field, sending the file's own base name
	part, err := writer.CreateFormFile("upload", filepath.Base(filePath))
	if err != nil {
		return err
	}
	if _, err := io.Copy(part, file); err != nil {
		return err
	}

	// Add other form fields
	if err := writer.WriteField("description", "Important document"); err != nil {
		return err
	}
	// Close writes the final boundary and must happen before sending
	if err := writer.Close(); err != nil {
		return err
	}

	// PostRaw takes no headers, so use Request to set the multipart Content-Type
	headers := http.Header{}
	headers.Set("Content-Type", writer.FormDataContentType())
	return c.Request("POST", actionURL, &body, nil, headers)
}
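To check what such a request body actually contains, the multipart encoding can be exercised in isolation. This is a minimal standard-library sketch, not Colly code: buildUpload and readUpload are hypothetical helpers, and readUpload decodes the body roughly the way a server would.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
)

// buildUpload encodes one file part plus extra text fields as multipart/form-data
// and returns the body together with its Content-Type header value.
func buildUpload(fileField, fileName string, content []byte, fields map[string]string) ([]byte, string, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, err := w.CreateFormFile(fileField, fileName)
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(content); err != nil {
		return nil, "", err
	}
	for name, value := range fields {
		if err := w.WriteField(name, value); err != nil {
			return nil, "", err
		}
	}
	// Close writes the terminating boundary; the body is incomplete without it.
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body.Bytes(), w.FormDataContentType(), nil
}

// readUpload decodes a multipart body. Text fields are keyed by field name,
// file parts by "field:filename".
func readUpload(body []byte, contentType string) (map[string]string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return nil, err
	}
	r := multipart.NewReader(bytes.NewReader(body), params["boundary"])
	parts := map[string]string{}
	for {
		p, err := r.NextPart()
		if err == io.EOF {
			return parts, nil
		}
		if err != nil {
			return nil, err
		}
		data, err := io.ReadAll(p)
		if err != nil {
			return nil, err
		}
		key := p.FormName()
		if p.FileName() != "" {
			key += ":" + p.FileName()
		}
		parts[key] = string(data)
	}
}

func main() {
	body, contentType, err := buildUpload("upload", "document.pdf", []byte("demo file content"),
		map[string]string{"description": "Important document"})
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	parts, err := readUpload(body, contentType)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(parts), parts["description"])
}
```

The boundary lives in the Content-Type value, which is why the header returned by FormDataContentType has to be sent with the body unchanged.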
Best Practices and Considerations
1. Session Management
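// NewCollector already attaches a cookie jar, so session cookies persist
// across requests made with the same collector
c := colly.NewCollector()
// Send a consistent User-Agent with every request
c.OnRequest(func(r *colly.Request) {
	r.Headers.Set("User-Agent", "Mozilla/5.0 (compatible; Colly)")
})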
2. Rate Limiting
// Add delays to avoid overwhelming servers (requires the "time" import)
if err := c.Limit(&colly.LimitRule{
	DomainGlob:  "*",
	Parallelism: 1,
	Delay:       2 * time.Second,
}); err != nil {
	log.Fatal(err)
}
3. Error Handling
c.OnError(func(r *colly.Response, err error) {
	if r != nil {
		log.Printf("Failed to submit form to %s: Status %d, Error: %v",
			r.Request.URL, r.StatusCode, err)
	} else {
		log.Printf("Request failed: %v", err)
	}
})
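Logging is often paired with a retry for temporary failures. The following is a minimal sketch of retrying with exponential backoff using only the standard library. retry and errTransient are hypothetical names, and the operation passed in stands in for whatever performs the submission. Only retry submissions that are safe to repeat, since resending a POST can create duplicate records.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient stands in for a temporary failure such as a timeout or a 5xx response.
var errTransient = errors.New("transient failure")

// retry calls op up to attempts times, doubling the wait after each failure.
// It returns nil on the first success, or the last error wrapped with context.
func retry(attempts int, baseDelay time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := baseDelay
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		// Do not sleep after the final attempt.
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // simulate two failed submissions
		}
		return nil
	})
	fmt.Println(calls, err)
}
```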
Limitations and Alternatives
Colly Limitations:
- No JavaScript execution (SPA forms may not work)
- Cannot handle complex client-side validations
- Limited support for dynamic form fields
When to Use Alternatives:
- Chromedp or Rod: For JavaScript-heavy forms
- Playwright-go: For complex browser automation
- HTTP clients: For simple API-based form submissions
Security and Ethics
- Check robots.txt and terms of service
- Respect rate limits and server resources
- Handle CSRF tokens properly for security
- Use appropriate user agents to identify your bot
- Implement proper error handling and retries
Colly handles form submission well for server-rendered pages, including logins, CSRF-protected forms, and file uploads. For forms that depend on JavaScript, use one of the browser automation tools listed above.