What is the difference between reqwest and hyper for web scraping in Rust?
When building web scrapers in Rust, choosing the right HTTP client is crucial for performance, maintainability, and ease of development. The two most popular options are reqwest and hyper, each serving different needs and use cases. Understanding their differences will help you make an informed decision for your web scraping projects.
Overview of reqwest and hyper
Hyper is a low-level, fast HTTP implementation that serves as the foundation for many Rust HTTP libraries. It's designed for maximum performance and flexibility but requires more boilerplate code.
Reqwest is a high-level HTTP client built on top of hyper that provides a more user-friendly API similar to Python's requests library. It abstracts away much of the complexity while maintaining good performance.
Key Differences
1. Ease of Use
Reqwest wins hands-down for developer experience:
use reqwest;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Simple GET request with reqwest
    let response = reqwest::get("https://httpbin.org/json")
        .await?
        .text()
        .await?;

    println!("Response: {}", response);
    Ok(())
}
Hyper requires more setup and boilerplate:
use hyper::{Body, Client, Request, Uri};
use hyper_tls::HttpsConnector;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Set up the HTTPS connector
    let https = HttpsConnector::new();
    let client = Client::builder().build::<_, Body>(https);

    // Create the request
    let uri: Uri = "https://httpbin.org/json".parse()?;
    let req = Request::builder()
        .method("GET")
        .uri(uri)
        .body(Body::empty())?;

    // Send the request and read the body manually
    let resp = client.request(req).await?;
    let body_bytes = hyper::body::to_bytes(resp.into_body()).await?;
    let body = String::from_utf8(body_bytes.to_vec())?;

    println!("Response: {}", body);
    Ok(())
}
2. Performance Characteristics
Hyper offers superior performance for high-throughput scenarios:
- Lower memory overhead
- Faster request/response cycles
- Better connection pooling control (see the sketch after this list)
- Minimal abstraction layers
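To make that pooling control concrete, here is a minimal sketch of tuning hyper's connection pool directly on the client builder. It assumes hyper 0.14 with hyper-tls, matching the other examples in this article; the limits shown are illustrative, not recommendations:

use hyper::{Body, Client, client::HttpConnector};
use hyper_tls::HttpsConnector;
use std::time::Duration;

fn build_tuned_client() -> Client<HttpsConnector<HttpConnector>, Body> {
    let https = HttpsConnector::new();
    Client::builder()
        // Close connections that sit idle for 30 seconds
        .pool_idle_timeout(Duration::from_secs(30))
        // Keep at most 8 idle connections per host
        .pool_max_idle_per_host(8)
        .build::<_, Body>(https)
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = build_tuned_client();
    let resp = client.get("https://httpbin.org/json".parse()?).await?;
    println!("Status: {}", resp.status());
    Ok(())
}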
Reqwest provides good performance with convenience:
- Built on hyper's performance foundation
- Automatic connection pooling, tunable via the builder (see the sketch after this list)
- Slightly higher memory usage due to abstractions
- Excellent for most web scraping scenarios
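Reqwest's automatic pooling is tunable too; a minimal sketch, assuming reqwest 0.11 as in the Cargo.toml section below (the values are again illustrative):

use reqwest::Client;
use std::time::Duration;

fn build_tuned_client() -> Result<Client, reqwest::Error> {
    Client::builder()
        // Keep idle connections alive for up to 90 seconds
        .pool_idle_timeout(Duration::from_secs(90))
        // Keep at most 8 idle connections per host
        .pool_max_idle_per_host(8)
        // Fail any request that takes longer than 30 seconds
        .timeout(Duration::from_secs(30))
        .build()
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = build_tuned_client()?;
    let status = client.get("https://httpbin.org/json").send().await?.status();
    println!("Status: {}", status);
    Ok(())
}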
3. Feature Set Comparison
| Feature | Reqwest | Hyper |
|---------|---------|-------|
| JSON handling | ✅ Built-in | ❌ Manual |
| Cookie support | ✅ Automatic | ❌ Manual |
| Redirects | ✅ Automatic | ❌ Manual |
| Proxy support | ✅ Built-in | ❌ Manual |
| Form data | ✅ Easy API | ❌ Manual |
| Compression | ✅ Automatic | ❌ Manual |
| Timeouts | ✅ Simple config | ❌ Manual |
| HTTP/2 | ✅ Automatic | ✅ Yes |
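To see what "Manual" means in practice, take the JSON row: with hyper you collect the body bytes yourself and hand them to serde_json. A minimal sketch, assuming hyper 0.14, hyper-tls, and serde_json from the Cargo.toml section below:

use hyper::{body, Body, Client};
use hyper_tls::HttpsConnector;
use serde_json::Value;

async fn fetch_json(url: &str) -> Result<Value, Box<dyn std::error::Error>> {
    let client = Client::builder().build::<_, Body>(HttpsConnector::new());
    let resp = client.get(url.parse()?).await?;
    // Collect the body into bytes manually, then parse with serde_json
    let bytes = body::to_bytes(resp.into_body()).await?;
    Ok(serde_json::from_slice(&bytes)?)
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let value = fetch_json("https://httpbin.org/json").await?;
    println!("{:#}", value);
    Ok(())
}

With reqwest, the same work collapses into the single .json() call shown in the first example of the next section.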
Practical Web Scraping Examples
Scraping with Authentication (reqwest)
use reqwest::{Client, header};
use serde_json::Value;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let client = Client::builder()
        .user_agent("Mozilla/5.0 (compatible; WebScraper/1.0)")
        .timeout(std::time::Duration::from_secs(30))
        .build()?;

    // Scrape with custom headers
    let response: Value = client
        .get("https://api.github.com/user")
        .header(header::AUTHORIZATION, "token YOUR_TOKEN")
        .send()
        .await?
        .json()
        .await?;

    println!("User data: {:#}", response);
    Ok(())
}
Session Management for Login-Based Scraping
use reqwest::{Client, cookie::Jar};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a cookie jar for session management
    let jar = Arc::new(Jar::default());
    let client = Client::builder()
        .cookie_provider(jar.clone())
        .build()?;

    // Login request
    let login_data = [("username", "user"), ("password", "pass")];
    client
        .post("https://example.com/login")
        .form(&login_data)
        .send()
        .await?;

    // Subsequent authenticated requests reuse the session cookies
    let protected_content = client
        .get("https://example.com/protected")
        .send()
        .await?
        .text()
        .await?;

    println!("Protected content: {}", protected_content);
    Ok(())
}
High-Performance Concurrent Scraping
use reqwest::Client;
use tokio::time::{sleep, Duration};
use futures::future::join_all;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new();

    let urls = vec![
        "https://httpbin.org/delay/1",
        "https://httpbin.org/delay/2",
        "https://httpbin.org/delay/3",
    ];

    // Create concurrent requests
    let requests = urls.into_iter().map(|url| {
        let client = client.clone();
        async move {
            // Add delay to respect rate limits
            sleep(Duration::from_millis(100)).await;
            client.get(url)
                .send()
                .await?
                .text()
                .await
        }
    });

    // Execute all requests concurrently
    let results = join_all(requests).await;

    for (i, result) in results.into_iter().enumerate() {
        match result {
            Ok(content) => println!("Request {}: {} bytes", i, content.len()),
            Err(e) => eprintln!("Request {}: Error - {}", i, e),
        }
    }

    Ok(())
}
Error Handling and Timeout Management
Reqwest Error Handling
use reqwest::Client;
use std::time::Duration;

#[tokio::main]
async fn main() {
    let client = Client::builder()
        .timeout(Duration::from_secs(10))
        .build()
        .unwrap();

    match client.get("https://httpbin.org/status/404").send().await {
        Ok(response) => {
            if response.status().is_success() {
                println!("Success: {}", response.text().await.unwrap());
            } else {
                println!("HTTP Error: {}", response.status());
            }
        }
        Err(e) => {
            if e.is_timeout() {
                println!("Request timed out");
            } else if e.is_connect() {
                println!("Connection failed");
            } else {
                println!("Other error: {}", e);
            }
        }
    }
}
Hyper with Custom Error Handling
use hyper::{Client, Request, Body, StatusCode};
use hyper_tls::HttpsConnector;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let https = HttpsConnector::new();
    let client = Client::builder()
        .pool_idle_timeout(Duration::from_secs(30))
        .build::<_, Body>(https);

    let req = Request::builder()
        .method("GET")
        .uri("https://httpbin.org/status/404")
        .body(Body::empty())?;

    let resp = client.request(req).await?;

    match resp.status() {
        StatusCode::OK => {
            let body = hyper::body::to_bytes(resp.into_body()).await?;
            println!("Success: {}", String::from_utf8_lossy(&body));
        }
        StatusCode::NOT_FOUND => {
            println!("Resource not found");
        }
        status => {
            println!("Unexpected status: {}", status);
        }
    }

    Ok(())
}
Cargo.toml Dependencies
For a typical web scraping project, here are the dependencies you'll need:
Reqwest Setup
[dependencies]
reqwest = { version = "0.11", features = ["json", "cookies"] }
tokio = { version = "1.0", features = ["full"] }
futures = "0.3"
serde_json = "1.0"
Hyper Setup
[dependencies]
hyper = { version = "0.14", features = ["client", "http1", "http2"] }
hyper-tls = "0.5"
tokio = { version = "1.0", features = ["full"] }
serde_json = "1.0"
When to Choose Each Library
Choose Reqwest When:
- Rapid development is prioritized
- Building typical web scrapers with standard requirements
- You need built-in features like JSON parsing, cookies, redirects
- Your team has mixed experience levels with Rust
- Maintenance simplicity is important
- Working with APIs that require authentication
- Scraping sites that need session management
Choose Hyper When:
- Maximum performance is critical
- Building high-throughput systems (thousands of requests/second)
- You need fine-grained control over HTTP behavior
- Memory usage must be minimized
- Building custom HTTP tooling or proxies
- You're experienced with low-level HTTP handling
- Working with custom protocols or non-standard HTTP usage
Performance Benchmarks
As rough, order-of-magnitude figures in typical web scraping scenarios:
- Reqwest: ~2000-5000 requests/second (depending on response size)
- Hyper: ~5000-10000 requests/second (with careful tuning)
However, for most web scraping projects the difference is negligible compared to network latency and the target server's response time. As in browser automation, where handling timeouts is crucial, proper timeout configuration matters more than raw client performance for most scraping tasks.
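As a concrete sketch of that point, reqwest supports both a client-wide default timeout and a per-request override (the per-request timeout method requires a recent 0.11.x release; the endpoint and durations here are illustrative):

use reqwest::Client;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Client-wide default: every request is cut off after 10 seconds
    let client = Client::builder()
        .timeout(Duration::from_secs(10))
        .build()?;

    // Per-request override for a known-slow endpoint
    let body = client
        .get("https://httpbin.org/delay/3")
        .timeout(Duration::from_secs(5))
        .send()
        .await?
        .text()
        .await?;

    println!("Got {} bytes", body.len());
    Ok(())
}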
Migration Considerations
If you start with reqwest and later need hyper's performance, migration is possible but requires significant code changes. It's often better to start with reqwest for prototyping and only move to hyper if performance profiling shows it's necessary.
For JavaScript developers transitioning to Rust, reqwest's API will feel more familiar, similar to fetch() or axios, while hyper requires understanding Rust's lower-level HTTP concepts.
Conclusion
For most web scraping projects in Rust, reqwest is the recommended choice due to its excellent balance of performance, features, and developer experience. Its built-in support for common scraping needs like cookies, redirects, and JSON parsing makes it ideal for rapid development.
Choose hyper only when you have specific performance requirements that reqwest cannot meet, or when you need fine-grained control over HTTP behavior that reqwest's abstractions don't provide.
Both libraries are actively maintained and production-ready, so your choice should primarily depend on your project's specific requirements and your team's expertise level with Rust's HTTP ecosystem.