Yes, Reqwest provides multiple ways to customize connection pool settings for optimal performance and resource management. Connection pooling is crucial for web scraping and high-throughput applications as it reuses TCP connections across multiple HTTP requests.
## Built-in Connection Pool Settings
Reqwest's `ClientBuilder` offers several built-in configuration options:
```rust
use reqwest::Client;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = Client::builder()
        .pool_max_idle_per_host(50)                 // Max idle connections per host
        .pool_idle_timeout(Duration::from_secs(90)) // Idle timeout before closing
        .timeout(Duration::from_secs(30))           // Request timeout
        .connect_timeout(Duration::from_secs(10))   // Connection timeout
        .tcp_keepalive(Duration::from_secs(60))     // TCP keep-alive interval
        .tcp_nodelay(true)                          // Disable Nagle's algorithm
        .build()?;

    // Make multiple requests - connections will be reused
    for i in 0..5 {
        let response = client
            .get("https://httpbin.org/get")
            .send()
            .await?;
        println!("Request {}: {}", i + 1, response.status());
    }

    Ok(())
}
```
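To see the effect of pooling for yourself, one option is to time repeated requests: with the pool enabled, every request after the first skips the TCP and TLS handshake. A minimal sketch; the comparison client with `pool_max_idle_per_host(0)` (which keeps no idle connections) is an addition for illustration, not part of the example above:

```rust
use reqwest::Client;
use std::time::Instant;

async fn timed_requests(client: &Client, label: &str) -> Result<(), reqwest::Error> {
    for i in 0..3 {
        let start = Instant::now();
        client.get("https://httpbin.org/get").send().await?;
        println!("{label} request {}: {:?}", i + 1, start.elapsed());
    }
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    // Default pooling: idle connections are kept and reused
    let pooled = Client::builder().pool_max_idle_per_host(10).build()?;
    // pool_max_idle_per_host(0) keeps no idle connections, so every
    // request pays the full TCP + TLS handshake again
    let unpooled = Client::builder().pool_max_idle_per_host(0).build()?;

    timed_requests(&pooled, "pooled").await?;
    timed_requests(&unpooled, "unpooled").await?;
    Ok(())
}
```

After the first request, the pooled client's timings should drop noticeably, while the unpooled client's stay roughly constant.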
## Advanced Custom Connection Pool
For finer-grained control, you can drop down to hyper (the HTTP library underneath reqwest) and configure its connector and pool directly. Note that reqwest's builder does not accept a custom hyper client, so requests made this way go through hyper itself:
```rust
use http_body_util::Empty;
use hyper::body::Bytes;
use hyper_util::client::legacy::{connect::HttpConnector, Client as HyperClient};
use hyper_util::rt::TokioExecutor;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure the low-level HTTP connector
    let mut connector = HttpConnector::new();
    connector.enforce_http(false);
    connector.set_keepalive(Some(Duration::from_secs(75)));
    connector.set_nodelay(true);
    connector.set_connect_timeout(Some(Duration::from_secs(10)));

    // Build a hyper client with custom pool settings
    let hyper_client: HyperClient<_, Empty<Bytes>> =
        HyperClient::builder(TokioExecutor::new())
            .pool_idle_timeout(Duration::from_secs(30))
            .pool_max_idle_per_host(25)
            .build(connector);

    // A bare HttpConnector speaks plain HTTP only; wrap it with
    // hyper-tls or hyper-rustls if you need HTTPS
    let response = hyper_client
        .get("http://httpbin.org/delay/1".parse()?)
        .await?;
    println!("Response: {}", response.status());
    Ok(())
}
```
## Connection Pool for Web Scraping
Here's a practical example optimized for web scraping scenarios:
```rust
use reqwest::{header::{HeaderMap, HeaderValue, USER_AGENT}, Client};
use std::time::Duration;
use tokio::time::sleep;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure headers
    let mut headers = HeaderMap::new();
    headers.insert(USER_AGENT, HeaderValue::from_static(
        "Mozilla/5.0 (compatible; WebScraper/1.0)",
    ));

    let client = Client::builder()
        .pool_max_idle_per_host(10)                 // Reasonable for scraping
        .pool_idle_timeout(Duration::from_secs(30)) // Quick cleanup
        .timeout(Duration::from_secs(20))           // Reasonable timeout
        .connect_timeout(Duration::from_secs(5))    // Fast connection attempt
        .tcp_keepalive(Duration::from_secs(60))     // Keep connections alive
        .default_headers(headers)
        .gzip(true)                                 // Compression (requires the "gzip" feature)
        .build()?;

    let urls = vec![
        "https://httpbin.org/json",
        "https://httpbin.org/user-agent",
        "https://httpbin.org/headers",
    ];

    for url in urls {
        let response = client.get(url).send().await?;
        println!("URL: {} - Status: {}", url, response.status());
        // Rate limiting - be respectful
        sleep(Duration::from_millis(500)).await;
    }

    Ok(())
}
```
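At higher volume, the pool really pays off with concurrent requests to the same host. A sketch assuming the `futures` crate (not used in the example above), capping in-flight requests at the pool size so finished connections are immediately reused:

```rust
use futures::{stream, StreamExt};
use reqwest::Client;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::builder()
        .pool_max_idle_per_host(10)
        .pool_idle_timeout(Duration::from_secs(30))
        .build()?;

    let urls: Vec<String> = (0..20)
        .map(|i| format!("https://httpbin.org/anything/{i}"))
        .collect();

    // At most 10 requests in flight, matching pool_max_idle_per_host
    let results: Vec<_> = stream::iter(urls)
        .map(|url| {
            let client = client.clone(); // clones share the same pool
            async move {
                let status = client.get(&url).send().await?.status();
                Ok::<_, reqwest::Error>((url, status))
            }
        })
        .buffer_unordered(10)
        .collect()
        .await;

    for result in results {
        match result {
            Ok((url, status)) => println!("{url}: {status}"),
            Err(e) => eprintln!("request failed: {e}"),
        }
    }
    Ok(())
}
```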
## Connection Pool Monitoring
Reqwest doesn't expose pool metrics directly, so in practice you control pool pressure from the outside. Here a semaphore caps concurrent requests so the pool (and the target server) isn't overwhelmed:
```rust
use reqwest::Client;
use std::sync::Arc;
use tokio::sync::Semaphore;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::builder()
        .pool_max_idle_per_host(20)
        .build()?;

    // Limit concurrent requests to avoid overwhelming the target
    let semaphore = Arc::new(Semaphore::new(5));
    let mut tasks = vec![];

    for i in 0..20 {
        let client = client.clone();
        let sem = semaphore.clone();
        let task = tokio::spawn(async move {
            let _permit = sem.acquire().await.unwrap();
            let response = client
                .get(format!("https://httpbin.org/delay/{}", i % 3))
                .send()
                .await?;
            println!("Task {}: Status {}", i, response.status());
            Ok::<(), reqwest::Error>(())
        });
        tasks.push(task);
    }

    // Wait for all tasks to complete
    for task in tasks {
        let _ = task.await;
    }

    Ok(())
}
```
## Key Configuration Options
| Setting | Purpose | Recommended Value |
|---------|---------|-------------------|
| `pool_max_idle_per_host` | Maximum idle connections per host | 10-50 (based on load) |
| `pool_idle_timeout` | Time before closing idle connections | 30-90 seconds |
| `timeout` | Overall request timeout | 10-30 seconds |
| `connect_timeout` | Connection establishment timeout | 5-10 seconds |
| `tcp_keepalive` | TCP keep-alive interval | 60 seconds |
| `tcp_nodelay` | Disable Nagle's algorithm | `true` for low latency |
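Some of these builder methods take `Into<Option<Duration>>`, so you can pass `None` to disable a limit entirely. A small hedged sketch (the helper name is made up for illustration) of a client whose idle connections are kept until the server closes them:

```rust
use reqwest::Client;
use std::time::Duration;

fn long_lived_client() -> reqwest::Result<Client> {
    Client::builder()
        // None disables the idle timeout: pooled sockets are kept until
        // the server closes them or a keep-alive probe fails
        .pool_idle_timeout(None)
        .tcp_keepalive(Duration::from_secs(60))
        .build()
}
```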
## Best Practices
- **Start Conservative**: Begin with default settings and adjust based on monitoring
- **Monitor Resource Usage**: Track memory and file descriptor usage
- **Consider Target Limits**: Respect server rate limits and connection policies
- **Test Under Load**: Validate settings under expected traffic patterns
- **Reuse Clients**: Keep a single `Client` alive for the lifetime of your application; clones share one pool (see the sketch below)
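
One way to follow that last point is a process-wide client behind `std::sync::OnceLock`. This pattern is a sketch of my own, not something the examples above require; passing a cloned `Client` around works just as well, since clones share the same pool:

```rust
use reqwest::Client;
use std::sync::OnceLock;

// One Client, and therefore one connection pool, for the whole process
static CLIENT: OnceLock<Client> = OnceLock::new();

fn http_client() -> &'static Client {
    CLIENT.get_or_init(|| {
        Client::builder()
            .pool_max_idle_per_host(20)
            .build()
            .expect("failed to build reqwest client")
    })
}

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    // Every call site shares the same pooled connections
    let status = http_client()
        .get("https://httpbin.org/get")
        .send()
        .await?
        .status();
    println!("Status: {status}");
    Ok(())
}
```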
Connection pool optimization significantly improves performance for applications making multiple HTTP requests, especially in web scraping scenarios where you're fetching data from the same hosts repeatedly.