HTTParty is a popular Ruby gem for making HTTP requests. When scraping HTTPS websites, you may need to configure custom SSL settings to handle self-signed certificates, client certificate authentication, or specific security requirements.
Basic SSL Configuration
HTTParty provides several SSL options that can be passed as parameters to HTTP methods:
1. Disabling SSL Verification (Development Only)
⚠️ Warning: Only use this for development or testing. Never disable SSL verification in production.
require 'httparty'
# Disable SSL verification for testing
response = HTTParty.get('https://self-signed.example.com', verify: false)
puts response.body
2. Custom CA Certificate Files
When dealing with self-signed certificates or custom Certificate Authorities:
require 'httparty'
# Single CA certificate file
options = {
  ssl_ca_file: '/path/to/ca_certificate.pem'
}
response = HTTParty.get('https://custom-ca.example.com', options)
# Multiple CA certificates directory
options = {
  ssl_ca_path: '/path/to/ca_certificates_directory/'
}
response = HTTParty.get('https://custom-ca.example.com', options)
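A mistyped CA path tends to surface later as a confusing verification failure, so it can help to check the path before making a request. The sketch below is one way to do that; `ca_options` is a hypothetical helper, not part of HTTParty, and it only chooses between the `ssl_ca_file` and `ssl_ca_path` options described above. Note that OpenSSL generally expects a CA directory to contain hash-named links (as produced by `openssl rehash`), which this helper does not check.

```ruby
# Hypothetical helper (not part of HTTParty): picks the CA option that
# matches what the path points at, and fails early if the path is missing.
def ca_options(path)
  if File.directory?(path)
    { ssl_ca_path: path }
  elsif File.file?(path)
    { ssl_ca_file: path }
  else
    raise ArgumentError, "CA path not found: #{path}"
  end
end

# Usage (assumes the httparty gem is loaded and the path exists):
#   options = ca_options('/path/to/ca_certificate.pem')
#   response = HTTParty.get('https://custom-ca.example.com', options)
```

Raising on a missing path is a design choice: the request never starts with a half-configured trust store.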
3. Client Certificate Authentication
For servers requiring client certificates (mutual TLS):
require 'httparty'
# Using a single PEM file that contains the client certificate followed by its private key
options = {
  pem: File.read('/path/to/client_cert.pem'),
  pem_password: 'certificate_password', # Only needed if the private key is encrypted
  verify: true
}
response = HTTParty.get('https://client-cert.example.com', options)
# Alternative: certificate and key stored in separate files.
# HTTParty reads both from the :pem option, so join them with the certificate first.
options = {
  pem: [File.read('/path/to/client.crt'), File.read('/path/to/client.key')].join("\n"),
  pem_password: 'key_password', # Only needed if the private key is encrypted
  verify: true
}
response = HTTParty.get('https://client-cert.example.com', options)
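When the certificate and key live in separate files, it may be worth validating the combined PEM before handing it to HTTParty. The following is a minimal sketch using only Ruby's standard `openssl` library; `combined_pem` is a hypothetical helper name, and the file paths in the usage comment are placeholders.

```ruby
require 'openssl'

# Hypothetical helper (not part of HTTParty): joins a certificate file and a
# private key file into one PEM string, certificate first, and checks that
# OpenSSL can parse both parts before the string is used.
def combined_pem(cert_path, key_path, key_password = nil)
  pem = [File.read(cert_path), File.read(key_path)].map(&:strip).join("\n") + "\n"
  OpenSSL::X509::Certificate.new(pem)   # raises if no certificate can be parsed
  OpenSSL::PKey.read(pem, key_password) # raises if no usable private key is found
  pem
end

# Usage (assumes the httparty gem is loaded):
#   pem = combined_pem('/path/to/client.crt', '/path/to/client.key', 'key_password')
#   options = { pem: pem, pem_password: 'key_password', verify: true }
#   response = HTTParty.get('https://client-cert.example.com', options)
```

Parsing both parts up front turns a silent "no client certificate sent" into an immediate exception.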
Advanced SSL Configuration
TLS Version and Cipher Selection
require 'httparty'
options = {
  ssl_version: :TLSv1_2, # Pin TLS 1.2; the cipher names below are TLS 1.2 suites
  ciphers: [
    'ECDHE-RSA-AES256-GCM-SHA384',
    'ECDHE-RSA-AES128-GCM-SHA256'
  ],
  verify: true
}
response = HTTParty.get('https://secure.example.com', options)
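Cipher names that the local OpenSSL build does not recognise can cause the request to fail before any data is sent. As a rough sketch, the list can be filtered against what the installed OpenSSL accepts; `supported_ciphers` is a hypothetical helper that uses only the standard `openssl` library.

```ruby
require 'openssl'

# Hypothetical helper: returns the subset of cipher names that the local
# OpenSSL build accepts for an SSL context.
def supported_ciphers(names)
  names.select do |name|
    context = OpenSSL::SSL::SSLContext.new
    begin
      context.ciphers = name
      # Each entry is [name, protocol version, bits, algorithm bits]
      context.ciphers.any? { |cipher_name, *_rest| cipher_name == name }
    rescue OpenSSL::SSL::SSLError
      false # OpenSSL found no match for this name
    end
  end
end

# Usage (assumes the httparty gem is loaded):
#   wanted = ['ECDHE-RSA-AES256-GCM-SHA384', 'ECDHE-RSA-AES128-GCM-SHA256']
#   options = { ciphers: supported_ciphers(wanted), verify: true }
#   response = HTTParty.get('https://secure.example.com', options)
```

If the filtered list comes back empty, it is probably safer to stop and report the problem than to fall back to defaults silently.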
SSL Timeout Configuration
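require 'httparty'
# HTTParty's timeout options include timeout, open_timeout, and read_timeout.
# open_timeout covers connection setup, which in Net::HTTP includes the TLS handshake.
options = {
  open_timeout: 30, # Seconds allowed for connecting, including the TLS handshake
  read_timeout: 30, # Seconds allowed for each read from the socket
  verify: true
}
response = HTTParty.get('https://slow-ssl.example.com', options)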
Class-Based Configuration
For consistent SSL settings across multiple requests:
require 'httparty'
class SecureScraper
  include HTTParty
  # Set default SSL options for all requests
  default_options.update({
    verify: true,
    ssl_ca_file: '/path/to/custom_ca.pem',
    ssl_version: :TLSv1_2,
    timeout: 30
  })
  base_uri 'https://api.example.com'
  headers 'User-Agent' => 'SecureScraper/1.0'
end
# All requests will use the configured SSL settings
response = SecureScraper.get('/data')
Environment-Specific Configuration
require 'httparty'
class FlexibleScraper
  include HTTParty
  # Assumes a Rails application, where Rails.env and Rails.root are defined
  def self.configure_ssl_for_environment
    if Rails.env.development?
      # Relaxed settings for development
      default_options.update(verify: false)
    elsif Rails.env.production?
      # Strict settings for production
      default_options.update({
        verify: true,
        ssl_version: :TLSv1_2, # Pins TLS 1.2; Ruby's OpenSSL may not accept :TLSv1_3 here
        ssl_ca_file: Rails.root.join('config', 'ca-bundle.pem').to_s
      })
    end
  end
end
FlexibleScraper.configure_ssl_for_environment
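Outside Rails, the same idea can be expressed as a plain function that maps an environment name to an options hash. This is only a sketch: `ssl_options_for` and the `APP_ENV` variable are hypothetical names, and the choice to treat every unknown environment as strict is an assumption rather than something HTTParty prescribes.

```ruby
# Hypothetical helper: maps an environment name to HTTParty SSL options
# without depending on Rails. Anything other than development or test
# gets the strict settings.
def ssl_options_for(env, ca_file: nil)
  return { verify: false } if %w[development test].include?(env.to_s)

  options = { verify: true, ssl_version: :TLSv1_2 }
  options[:ssl_ca_file] = ca_file if ca_file
  options
end

# Usage (assumes the httparty gem is loaded; APP_ENV is an assumed variable name):
#   options = ssl_options_for(ENV.fetch('APP_ENV', 'production'), ca_file: 'config/ca-bundle.pem')
#   response = HTTParty.get('https://api.example.com/data', options)
```

Defaulting to the strict branch means a misspelled or unexpected environment name cannot disable verification by accident.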
Error Handling
Always handle SSL-related errors appropriately:
require 'httparty'
begin
  response = HTTParty.get('https://example.com', {
    verify: true,
    ssl_ca_file: '/path/to/ca.pem'
  })
  puts response.body
rescue OpenSSL::SSL::SSLError => e
  puts "SSL Error: #{e.message}"
  # Handle SSL verification failures
rescue HTTParty::Error => e
  puts "HTTP Error: #{e.message}"
end
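Timeouts, DNS failures, and refused connections are raised by Ruby's networking layer rather than as `HTTParty::Error`, so a scraper may want to separate them from certificate problems. The wrapper below is a sketch of one way to do that with standard-library exception classes; `classify_request` is a hypothetical name, and the block can contain any request.

```ruby
require 'openssl'
require 'socket'
require 'timeout'

# Hypothetical wrapper: runs the block and returns [:ok, result],
# [:ssl_error, message], or [:network_error, message]. Other exceptions
# are not rescued and propagate to the caller.
def classify_request
  [:ok, yield]
rescue OpenSSL::SSL::SSLError => e
  [:ssl_error, e.message]
rescue Timeout::Error, SocketError, SystemCallError => e
  [:network_error, "#{e.class}: #{e.message}"]
end

# Usage (assumes the httparty gem is loaded):
#   status, value = classify_request { HTTParty.get('https://example.com', verify: true) }
#   puts value.body if status == :ok
```

Keeping SSL failures in their own branch makes it harder to "fix" a certificate problem by retrying or, worse, by switching verification off.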
Security Best Practices
- Always verify certificates in production - Set verify: true
- Use strong TLS versions - Prefer TLS 1.2 or 1.3
- Keep CA certificates updated - Regularly update certificate bundles
- Secure certificate storage - Store certificates securely, never in version control
- Monitor certificate expiration - Implement alerts for expiring certificates
- Use environment variables - Store sensitive paths and passwords in environment variables
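# Secure configuration example
# ENV.fetch raises KeyError when a required variable is missing
options = {
  verify: true,
  ssl_ca_file: ENV.fetch('SSL_CA_FILE_PATH'),
  ssl_version: :TLSv1_2, # Pins TLS 1.2; Ruby's OpenSSL may not accept :TLSv1_3 here
  pem: File.read(ENV.fetch('CLIENT_CERT_PATH')),
  pem_password: ENV['CLIENT_CERT_PASSWORD']
}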
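The "Monitor certificate expiration" practice above can be approximated with a small check that runs alongside the scraper. This is a sketch using the standard `openssl` library; `days_until_expiry` is a hypothetical helper, and the 30-day threshold in the usage comment is an arbitrary example value.

```ruby
require 'openssl'

# Hypothetical helper: whole days until the first certificate in a PEM
# string expires (negative once it has expired).
def days_until_expiry(pem, now: Time.now)
  certificate = OpenSSL::X509::Certificate.new(pem)
  ((certificate.not_after - now) / 86_400).floor
end

# Usage (CLIENT_CERT_PATH is the variable from the example above):
#   remaining = days_until_expiry(File.read(ENV.fetch('CLIENT_CERT_PATH')))
#   warn "Client certificate expires in #{remaining} days" if remaining < 30
```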
Common SSL Options Summary
| Option | Description | Example |
|--------|-------------|---------|
| verify | Enable/disable SSL certificate verification (enabled by default) | true/false |
| ssl_ca_file | Path to CA certificate file | '/path/to/ca.pem' |
| ssl_ca_path | Directory containing CA certificates | '/etc/ssl/certs/' |
| pem | Client certificate and private key in PEM format | File.read('cert.pem') |
| pem_password | Password for an encrypted private key | 'password123' |
| ssl_version | Specific TLS version | :TLSv1_2 |
| ciphers | Allowed cipher suites | ['ECDHE-RSA-AES256-GCM-SHA384'] |
| open_timeout | Connection timeout in seconds, including the TLS handshake | 30 |
Remember: SSL configuration directly impacts both security and compatibility. Always test thoroughly and follow security best practices for production deployments.