How do I handle gzip or deflate compressed responses in reqwest?

When the gzip and deflate features are enabled, reqwest handles compression automatically, transparently decompressing response bodies for you. However, there are scenarios where you need manual control over decompression, such as streaming large files, analyzing raw compressed data, or implementing custom caching strategies.

Default Automatic Decompression

Reqwest automatically decompresses responses when the appropriate features are enabled:

[dependencies]
reqwest = { version = "0.11", features = ["gzip", "deflate"] }

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let response = reqwest::get("https://httpbin.org/gzip").await?;
    let text = response.text().await?; // Automatically decompressed
    println!("{}", text);
    Ok(())
}

Manual Decompression Control

To handle compressed responses yourself, disable automatic decompression on the client and decode the body with the flate2 crate. Two details matter here: the ClientBuilder::gzip and ClientBuilder::deflate methods used below are only available when the corresponding gzip and deflate features are enabled, and once decompression is disabled, reqwest no longer sends an Accept-Encoding header, so the examples set it explicitly.

Dependencies

[dependencies]
reqwest = { version = "0.11", features = ["gzip", "deflate", "stream", "blocking"] }
flate2 = "1.0"
futures-util = "0.3"
tokio = { version = "1.0", features = ["full"] }

Async Implementation

use flate2::read::{GzDecoder, DeflateDecoder};
use std::io::Read;

async fn fetch_and_decompress(url: &str) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    // Build client with disabled automatic decompression
    let client = reqwest::Client::builder()
        .gzip(false)
        .deflate(false)
        .build()?;

    // Make the request; with auto-decompression disabled, reqwest no longer
    // sends an Accept-Encoding header, so ask for compressed data explicitly
    let response = client
        .get(url)
        .header(reqwest::header::ACCEPT_ENCODING, "gzip, deflate")
        .send()
        .await?;

    // Check Content-Encoding header
    let content_encoding = response
        .headers()
        .get(reqwest::header::CONTENT_ENCODING)
        .and_then(|v| v.to_str().ok());

    // Get raw bytes (not text to avoid encoding issues)
    let body_bytes = response.bytes().await?;
    let mut decompressed_data = Vec::new();

    match content_encoding {
        Some("gzip") => {
            let mut decoder = GzDecoder::new(&body_bytes[..]);
            decoder.read_to_end(&mut decompressed_data)?;
        },
        Some("deflate") => {
            let mut decoder = DeflateDecoder::new(&body_bytes[..]);
            decoder.read_to_end(&mut decompressed_data)?;
        },
        Some("br") => {
            // Brotli compression (requires brotli crate)
            return Err("Brotli decompression not implemented in this example".into());
        },
        _ => {
            // No compression or unsupported encoding
            decompressed_data = body_bytes.to_vec();
        }
    }

    Ok(decompressed_data)
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://httpbin.org/gzip";
    let data = fetch_and_decompress(url).await?;

    // Convert to string if it's text data
    let text = String::from_utf8(data)?;
    println!("Decompressed content: {}", text);

    Ok(())
}
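
The brotli arm above is left unimplemented. If you need it, the brotli crate provides a Read-based decompressor. A minimal sketch, assuming brotli = "3" has been added to [dependencies] (the decompress_brotli name and the 4096-byte internal buffer size are our choices):

use std::io::Read;

// Sketch: decompress a Brotli-encoded body (Content-Encoding: br).
// Decompressor wraps any reader and itself implements Read.
fn decompress_brotli(body: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut decompressed = Vec::new();
    brotli::Decompressor::new(body, 4096).read_to_end(&mut decompressed)?;
    Ok(decompressed)
}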

Blocking Implementation

use reqwest::blocking;
use flate2::read::{GzDecoder, DeflateDecoder};
use std::io::Read;

fn fetch_and_decompress_blocking(url: &str) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    let client = blocking::Client::builder()
        .gzip(false)
        .deflate(false)
        .build()?;

    // As in the async version, request compressed data explicitly
    let response = client
        .get(url)
        .header(reqwest::header::ACCEPT_ENCODING, "gzip, deflate")
        .send()?;
    let content_encoding = response
        .headers()
        .get(reqwest::header::CONTENT_ENCODING)
        .and_then(|v| v.to_str().ok());

    let body_bytes = response.bytes()?;
    let mut decompressed_data = Vec::new();

    match content_encoding {
        Some("gzip") => {
            let mut decoder = GzDecoder::new(&body_bytes[..]);
            decoder.read_to_end(&mut decompressed_data)?;
        },
        Some("deflate") => {
            let mut decoder = DeflateDecoder::new(&body_bytes[..]);
            decoder.read_to_end(&mut decompressed_data)?;
        },
        _ => {
            decompressed_data = body_bytes.to_vec();
        }
    }

    Ok(decompressed_data)
}
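
The blocking version needs no async runtime, so a plain main can call it directly:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let data = fetch_and_decompress_blocking("https://httpbin.org/gzip")?;
    println!("Got {} decompressed bytes", data.len());
    Ok(())
}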

Streaming Decompression

For large responses, you can stream and decompress data incrementally:

use flate2::write::GzDecoder;
use futures_util::StreamExt;
use std::io::Write;

async fn stream_decompress(url: &str) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    let client = reqwest::Client::builder()
        .gzip(false)
        .build()?;

    let response = client
        .get(url)
        .header(reqwest::header::ACCEPT_ENCODING, "gzip")
        .send()
        .await?;

    let is_gzip = response
        .headers()
        .get(reqwest::header::CONTENT_ENCODING)
        .and_then(|v| v.to_str().ok())
        == Some("gzip");

    if is_gzip {
        // Write-based decoder: feed compressed chunks in as they arrive
        // instead of buffering the whole compressed body first
        let mut decoder = GzDecoder::new(Vec::new());
        let mut stream = response.bytes_stream();
        while let Some(chunk) = stream.next().await {
            decoder.write_all(&chunk?)?;
        }
        Ok(decoder.finish()?)
    } else {
        Ok(response.bytes().await?.to_vec())
    }
}

Here only the decompressed result is held in memory; the compressed body is processed chunk by chunk. If the output is also too large for memory, the same decoder can wrap a file instead of a Vec, as in the sketch below.
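
A minimal file-based variant (the function name and output path are illustrative; for brevity it assumes the response is gzip-encoded rather than checking Content-Encoding as above):

use flate2::write::GzDecoder;
use futures_util::StreamExt;
use std::fs::File;
use std::io::Write;

// Sketch: stream a gzip body straight into a file so neither the
// compressed nor the decompressed data has to fit in memory
async fn stream_to_file(url: &str, path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::builder().gzip(false).build()?;
    let response = client
        .get(url)
        .header(reqwest::header::ACCEPT_ENCODING, "gzip")
        .send()
        .await?;

    let mut decoder = GzDecoder::new(File::create(path)?);
    let mut stream = response.bytes_stream();
    while let Some(chunk) = stream.next().await {
        decoder.write_all(&chunk?)?;
    }
    decoder.finish()?;
    Ok(())
}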

When to Use Manual Decompression

Manual decompression is useful when you need to:

  • Analyze compression ratios: Compare compressed vs decompressed sizes (see the sketch after this list)
  • Implement custom caching: Cache compressed data to save storage
  • Stream processing: Decompress large files incrementally
  • Error handling: Provide specific handling for compression-related errors
  • Metrics collection: Track compression performance
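
As an example of the first point, here is a minimal sketch that reports the compression ratio of a gzip response. It reuses the manual pattern from above; the report_ratio name is ours, and the snippet assumes the server actually returns a gzip-encoded body:

use flate2::read::GzDecoder;
use std::io::Read;

// Sketch: compare a body's size before and after gzip decompression
async fn report_ratio(url: &str) -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::builder().gzip(false).build()?;
    let compressed = client
        .get(url)
        .header(reqwest::header::ACCEPT_ENCODING, "gzip")
        .send()
        .await?
        .bytes()
        .await?;

    let mut decompressed = Vec::new();
    GzDecoder::new(&compressed[..]).read_to_end(&mut decompressed)?;

    println!(
        "compressed: {} B, decompressed: {} B, ratio: {:.2}x",
        compressed.len(),
        decompressed.len(),
        decompressed.len() as f64 / compressed.len() as f64
    );
    Ok(())
}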

Error Handling Best Practices

use flate2::read::GzDecoder;
use std::io::Read;

#[derive(Debug)]
enum DecompressionError {
    NetworkError(reqwest::Error),
    CompressionError(std::io::Error),
    UnsupportedEncoding(String),
}

impl std::fmt::Display for DecompressionError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        match self {
            DecompressionError::NetworkError(e) => write!(f, "Network error: {}", e),
            DecompressionError::CompressionError(e) => write!(f, "Compression error: {}", e),
            DecompressionError::UnsupportedEncoding(e) => write!(f, "Unsupported encoding: {}", e),
        }
    }
}

impl std::error::Error for DecompressionError {}

async fn robust_fetch_and_decompress(url: &str) -> Result<Vec<u8>, DecompressionError> {
    let client = reqwest::Client::builder()
        .gzip(false)
        .deflate(false)
        .build()
        .map_err(DecompressionError::NetworkError)?;

    let response = client
        .get(url)
        .header(reqwest::header::ACCEPT_ENCODING, "gzip, deflate")
        .send()
        .await
        .map_err(DecompressionError::NetworkError)?;

    let content_encoding = response
        .headers()
        .get(reqwest::header::CONTENT_ENCODING)
        .and_then(|v| v.to_str().ok());

    let body_bytes = response.bytes().await
        .map_err(DecompressionError::NetworkError)?;

    let mut decompressed_data = Vec::new();

    match content_encoding {
        Some("gzip") => {
            let mut decoder = GzDecoder::new(&body_bytes[..]);
            decoder.read_to_end(&mut decompressed_data)
                .map_err(DecompressionError::CompressionError)?;
        },
        Some("deflate") => {
            let mut decoder = flate2::read::DeflateDecoder::new(&body_bytes[..]);
            decoder.read_to_end(&mut decompressed_data)
                .map_err(DecompressionError::CompressionError)?;
        },
        Some(encoding) => {
            return Err(DecompressionError::UnsupportedEncoding(encoding.to_string()));
        },
        None => {
            decompressed_data = body_bytes.to_vec();
        }
    }

    Ok(decompressed_data)
}
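
Because the error type distinguishes the failure modes, callers can react to each one differently:

#[tokio::main]
async fn main() {
    match robust_fetch_and_decompress("https://httpbin.org/gzip").await {
        Ok(data) => println!("Got {} decompressed bytes", data.len()),
        Err(DecompressionError::UnsupportedEncoding(enc)) => {
            eprintln!("Server used an encoding we do not handle: {}", enc);
        }
        Err(e) => eprintln!("Request failed: {}", e),
    }
}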

Performance Considerations

  • Memory usage: Buffering approaches hold the entire response in memory; use the streaming pattern above for large bodies
  • CPU overhead: Decompression is CPU-intensive for large files
  • Error resilience: Manual handling allows better error recovery
  • Request headers: With auto-decompression disabled, reqwest stops sending Accept-Encoding, so set it yourself if you still want compressed responses

For most applications, reqwest's automatic decompression is sufficient and recommended. Use manual decompression only when you have specific requirements that automatic handling cannot meet.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
