
How do I implement custom middleware or callbacks with HTTParty?

HTTParty provides several mechanisms for implementing custom middleware and callbacks to intercept and modify requests and responses. This functionality is essential for implementing logging, authentication, error handling, retry logic, and response transformation in your Ruby applications.

Understanding HTTParty Callbacks

HTTParty supports callbacks through several approaches:

  1. Class-level before and after hooks, added on top of HTTParty by wrapping its request methods
  2. Custom parser classes for response processing
  3. Connection adapters for request customization
  4. Explicit body encoding (for example to_json) for request body formatting

Implementing Before and After Callbacks

Basic Callback Setup

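The most straightforward callback style is a pair of before and after class methods. HTTParty does not define these itself, so the example below assumes the class has gained them from a hook module that wraps HTTParty's request methods: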

class ApiClient
  include HTTParty
  base_uri 'https://api.example.com'

  # Before callback - executed before each request
  before do |request|
    puts "Making request to: #{request.uri}"
    puts "Request headers: #{request.headers}"

    # Add authentication header
    request.headers['Authorization'] = "Bearer #{get_auth_token}"

    # Log request body for debugging
    puts "Request body: #{request.body}" if request.body
  end

  # After callback - executed after each response
  after do |request, response|
    puts "Response status: #{response.code}"
    puts "Response time: #{response.headers['X-Response-Time']}"

    # Log errors
    if response.code >= 400
      Rails.logger.error "API Error: #{response.code} - #{response.body}"
    end
  end

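  def self.get_auth_token
    # Your token retrieval logic
    ENV['API_TOKEN']
  end

  # A bare `private` does not apply to `def self.` methods, so hide it explicitly
  private_class_method :get_auth_token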
end
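
The before and after methods used above have to come from somewhere, because HTTParty does not ship them. What follows is a minimal sketch of one way to supply them, not an HTTParty API: RequestHooks, HookRequest, and Wrapper are hypothetical names. The sketch assumes the host class exposes default_options[:base_uri] and a class-level perform_request(http_method, path, options); in HTTParty that method is private and internal, so it may change between versions.

```ruby
require 'uri'

# Hypothetical mixin (not part of HTTParty): adds `before` and `after`
# class methods by wrapping the class-level perform_request method.
module RequestHooks
  # Simplified request view passed to callbacks; field names are assumptions.
  HookRequest = Struct.new(:http_method, :uri, :headers, :body)

  def self.included(base)
    base.extend(ClassMethods)
    # Prepending onto the singleton class lets `super` reach the original
    base.singleton_class.prepend(Wrapper)
  end

  module ClassMethods
    def before(&block)
      before_hooks << block
    end

    def after(&block)
      after_hooks << block
    end

    def before_hooks
      @before_hooks ||= []
    end

    def after_hooks
      @after_hooks ||= []
    end
  end

  module Wrapper
    private

    def perform_request(http_method, path, options = {}, &block)
      request = HookRequest.new(
        http_method,
        URI.parse("#{default_options[:base_uri]}#{path}"),
        (options[:headers] || {}).dup,
        options[:body]
      )
      before_hooks.each { |hook| hook.call(request) }

      # Hand any changes made by the callbacks to the real request
      changes = { headers: request.headers, body: request.body }.compact
      response = super(http_method, request.uri.to_s, options.merge(changes), &block)

      after_hooks.each { |hook| hook.call(request, response) }
      response
    end
  end
end
```

In a real client, writing include HTTParty followed by include RequestHooks would make the before and after blocks run around every request. Default headers set through HTTParty's headers class method are not visible to callbacks in this sketch, and hooks are not inherited by subclasses.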

Advanced Request Modification

You can modify requests extensively in before callbacks:

class EnhancedApiClient
  include HTTParty
  base_uri 'https://api.example.com'

  before do |request|
    # Add request ID for tracking
    request_id = SecureRandom.uuid
    request.headers['X-Request-ID'] = request_id

    # Add timestamp
    request.headers['X-Timestamp'] = Time.current.iso8601

    # Modify query parameters
    if request.uri.query
      query_params = CGI.parse(request.uri.query)
      query_params['client_version'] = ['1.0.0']
      request.uri.query = URI.encode_www_form(query_params.flat_map { |k, v| v.map { |val| [k, val] } })
    end

    # Add user agent
    request.headers['User-Agent'] = 'MyApp/1.0.0 (Ruby HTTParty)'

    # Log for debugging. http_method is a Net::HTTP class such as
    # Net::HTTP::Get, whose METHOD constant is the verb ("GET")
    puts "[#{request_id}] #{request.http_method::METHOD} #{request.uri}"
  end

  after do |request, response|
    request_id = request.headers['X-Request-ID']
    duration = response.headers['X-Response-Time'] || 'unknown'

    puts "[#{request_id}] Response: #{response.code} (#{duration})"

    # Handle rate limiting
    if response.code == 429
      retry_after = response.headers['Retry-After']
      puts "[#{request_id}] Rate limited. Retry after: #{retry_after} seconds"
    end
  end
end
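
The query rewriting step above only runs when the URI already has a query string. As a standalone sketch of the same idea that also covers the empty case, the helper below uses only Ruby's URI library; with_query_param is a hypothetical name introduced for illustration.

```ruby
require 'uri'

# Returns a copy of +uri+ with one query parameter appended,
# keeping any parameters that were already present.
def with_query_param(uri, key, value)
  pairs = URI.decode_www_form(uri.query.to_s)
  pairs << [key, value]

  updated = uri.dup
  updated.query = URI.encode_www_form(pairs)
  updated
end

uri = with_query_param(URI('https://api.example.com/users?page=2'), 'client_version', '1.0.0')
puts uri
```

Working on a copy keeps the caller's URI untouched, which makes the helper easier to test than in-place mutation inside a callback.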

Custom Response Parsers

Implement custom parsers to transform response data automatically:

class CustomJsonParser < HTTParty::Parser
  SupportedFormats = {'application/json' => :json, 'text/json' => :json}.freeze

  def parse
    case format
    when :json
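      # An empty body (for example 204 No Content) is not valid JSON
      return nil if body.nil? || body.strip.empty?

      parsed_json = JSON.parse(body)

      # Transform response structure
      if parsed_json.is_a?(Hash)
        {
          data: parsed_json,
          metadata: {
            parsed_at: Time.current,
            response_size: body.bytesize
            # Response headers are not available here: the parser only
            # receives the body and the format
          }
        }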
      else
        parsed_json
      end
    else
      body
    end
  rescue JSON::ParserError => e
    # Handle malformed JSON
    {
      error: 'Invalid JSON response',
      raw_body: body,
      parse_error: e.message
    }
  end
end

class ApiClientWithCustomParser
  include HTTParty
  base_uri 'https://api.example.com'
  parser CustomJsonParser

  def self.get_user(id)
    response = get("/users/#{id}")

    if response.parsed_response.is_a?(Hash) && response.parsed_response[:error]
      raise StandardError, "Parse error: #{response.parsed_response[:parse_error]}"
    end

    response.parsed_response
  end
end
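
A parser does not have to be a class. As far as I know, HTTParty's parser setting accepts any object that responds to call(body, format), so the same envelope logic can be sketched as a lambda that is easy to test without making a request. ENVELOPE_PARSER is a hypothetical name, and the envelope shape simply mirrors the class above.

```ruby
require 'json'

# Callable parser: receives the raw body and the detected format symbol.
ENVELOPE_PARSER = lambda do |body, format|
  return body unless format == :json
  return nil if body.nil? || body.strip.empty?

  parsed = JSON.parse(body)
  # Only wrap JSON objects; arrays and scalars pass through unchanged
  return parsed unless parsed.is_a?(Hash)

  { data: parsed, metadata: { response_size: body.bytesize } }
rescue JSON::ParserError => e
  { error: 'Invalid JSON response', raw_body: body, parse_error: e.message }
end

p ENVELOPE_PARSER.call('{"id":1}', :json)
```

Inside a client class this would be wired up with parser ENVELOPE_PARSER in place of parser CustomJsonParser.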

Implementing Middleware Pattern

Create a more sophisticated middleware system using modules:

module HTTPartyMiddleware
  module Logging
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def with_logging(logger = Rails.logger)
        before do |request|
          start_time = Time.current
          request.instance_variable_set(:@start_time, start_time)

          # http_method is a Net::HTTP class; its METHOD constant is the verb
          logger.info "HTTParty Request: #{request.http_method::METHOD} #{request.uri}"
          logger.debug "Headers: #{request.headers.inspect}"
          logger.debug "Body: #{request.body}" if request.body
        end

        after do |request, response|
          start_time = request.instance_variable_get(:@start_time)
          duration = ((Time.current - start_time) * 1000).round(2)

          logger.info "HTTParty Response: #{response.code} (#{duration}ms)"

          if response.code >= 400
            logger.error "Error Response Body: #{response.body}"
          else
            logger.debug "Response Body: #{response.body}"
          end
        end
      end
    end
  end

  module Authentication
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def with_bearer_auth(token_proc)
        before do |request|
          token = token_proc.call
          request.headers['Authorization'] = "Bearer #{token}"
        end
      end

      def with_api_key_auth(key, header_name = 'X-API-Key')
        before do |request|
          request.headers[header_name] = key
        end
      end
    end
  end

  module RetryLogic
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def with_retry(max_retries: 3, backoff: 1, retry_codes: [429, 502, 503, 504])
        # Raised internally so a retryable status can share the rescue path
        retryable_status = Class.new(StandardError)

        wrapper = Module.new do
          define_method(:perform_request) do |http_method, path, options = {}, &block|
            retries = 0

            begin
              response = super(http_method, path, options, &block)

              # `retry` is only valid inside `rescue`, so signal via an exception
              if retry_codes.include?(response.code) && retries < max_retries
                raise retryable_status, "status #{response.code}"
              end

              response
            rescue retryable_status, Net::OpenTimeout, Net::ReadTimeout => e
              # Timeouts are re-raised once the retry budget is spent
              raise if retries >= max_retries

              retries += 1
              sleep_time = backoff * (2**(retries - 1))

              puts "Request failed (#{e.message}), retrying in #{sleep_time} seconds (attempt #{retries}/#{max_retries})"
              sleep(sleep_time)

              retry
            end
          end

          private :perform_request
        end

        # Prepending onto the singleton class lets `super` reach HTTParty's
        # own class-level perform_request (a private, internal method)
        singleton_class.prepend(wrapper)
      end
    end
  end
end
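
The backoff arithmetic in the retry module can be isolated into a small helper, which makes the schedule easy to verify. This is a sketch under assumptions: with_backoff and the sleeper parameter are hypothetical names, and the block is expected to return an object that responds to code, as an HTTParty response does.

```ruby
# Runs the block, retrying with exponential backoff while it returns a
# retryable status code. `sleeper` is injectable so tests need not wait.
def with_backoff(max_retries: 3, backoff: 1, retry_codes: [429, 502, 503, 504], sleeper: method(:sleep))
  retries = 0

  loop do
    response = yield
    return response unless retry_codes.include?(response.code) && retries < max_retries

    retries += 1
    # Delay doubles on each attempt: backoff, 2 * backoff, 4 * backoff, ...
    sleeper.call(backoff * (2**(retries - 1)))
  end
end
```

With backoff: 2 and three retries the waits are 2, 4, and 8 seconds. When the retry budget is exhausted the last response is returned rather than raised, leaving the decision to the caller.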

Using the Middleware

class AdvancedApiClient
  include HTTParty
  include HTTPartyMiddleware::Logging
  include HTTPartyMiddleware::Authentication
  include HTTPartyMiddleware::RetryLogic

  base_uri 'https://api.example.com'
  default_timeout 30

  # Configure middleware
  with_logging(Logger.new(STDOUT))
  with_bearer_auth(-> { AuthService.get_token })
  with_retry(max_retries: 3, backoff: 2, retry_codes: [429, 502, 503, 504])

  def self.get_users(page: 1, per_page: 20)
    get('/users', query: { page: page, per_page: per_page })
  end

  def self.create_user(user_data)
    post('/users', body: user_data.to_json, headers: { 'Content-Type' => 'application/json' })
  end
end

Custom Connection Adapters

For more advanced request customization, you can create custom connection adapters:

module CustomHTTPartyAdapter
  class EnhancedAdapter < HTTParty::ConnectionAdapter
    # HTTParty invokes the adapter as a class-level call(uri, options)
    def self.call(uri, options)
      options = options.dup

      # Verify HTTPS peers against a custom CA bundle
      if uri.scheme == 'https'
        options[:verify] = true
        options[:ssl_ca_file] = Rails.root.join('config', 'ca-bundle.crt').to_s
      end

      # Keep-alive window; leave retries to the middleware
      options[:keep_alive_timeout] = 30
      options[:max_retries] = 0

      # Custom timeouts based on endpoint
      if uri.path.include?('/upload')
        options[:read_timeout] = 300  # 5 minutes for uploads
      elsif uri.path.include?('/reports')
        options[:read_timeout] = 120  # 2 minutes for reports
      else
        options[:read_timeout] = 30   # Default 30 seconds
      end

      super(uri, options)
    end
  end
end

class ApiClientWithCustomAdapter
  include HTTParty
  base_uri 'https://api.example.com'
  connection_adapter CustomHTTPartyAdapter::EnhancedAdapter
end
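
The per-endpoint timeout rule in the adapter is plain data, so it can be pulled into a lookup that is tested on its own. A minimal sketch follows; READ_TIMEOUTS, DEFAULT_READ_TIMEOUT, and read_timeout_for are hypothetical names, with the values taken from the adapter above.

```ruby
# Hypothetical timeout table: path fragment => read timeout in seconds.
READ_TIMEOUTS = {
  '/upload' => 300,
  '/reports' => 120
}.freeze
DEFAULT_READ_TIMEOUT = 30

# Returns the read timeout for the first fragment found in +path+.
def read_timeout_for(path)
  _fragment, timeout = READ_TIMEOUTS.find { |fragment, _seconds| path.include?(fragment) }
  timeout || DEFAULT_READ_TIMEOUT
end

puts read_timeout_for('/v1/reports/42')
```

The adapter body would then shrink to options[:read_timeout] = read_timeout_for(uri.path), and new endpoints become a one-line change to the table.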

Error Handling Middleware

Implement comprehensive error handling:

module HTTPartyMiddleware
  module ErrorHandling
    class APIError < StandardError
      attr_reader :response, :status_code

      def initialize(message, response)
        @response = response
        @status_code = response.code
        super(message)
      end
    end

    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def with_error_handling
        after do |request, response|
          case response.code
          when 400
            raise APIError.new("Bad Request: #{extract_error_message(response)}", response)
          when 401
            raise APIError.new("Unauthorized: Check your authentication credentials", response)
          when 403
            raise APIError.new("Forbidden: Insufficient permissions", response)
          when 404
            raise APIError.new("Not Found: #{request.uri}", response)
          when 422
            raise APIError.new("Validation Error: #{extract_validation_errors(response)}", response)
          when 429
            retry_after = response.headers['Retry-After']
            raise APIError.new("Rate Limited: Retry after #{retry_after} seconds", response)
          when 500..599
            raise APIError.new("Server Error (#{response.code}): #{extract_error_message(response)}", response)
          end
        end
      end

      private

      def extract_error_message(response)
        if response.headers['content-type']&.include?('application/json')
          parsed = JSON.parse(response.body)
          parsed['error'] || parsed['message'] || 'Unknown error'
        else
          response.body.truncate(100)
        end
      rescue JSON::ParserError
        response.body.truncate(100)
      end

      def extract_validation_errors(response)
        parsed = JSON.parse(response.body)
        if parsed['errors'].is_a?(Hash)
          parsed['errors'].map { |field, messages| "#{field}: #{Array(messages).join(', ')}" }.join('; ')
        else
          parsed['errors'] || parsed['message'] || 'Validation failed'
        end
      rescue JSON::ParserError
        'Validation failed'
      end
    end
  end
end
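
The message extraction above relies on String#truncate, which comes from ActiveSupport, and assumes the JSON body is an object. As a sketch of a variant that runs on plain Ruby, the helper below takes the body and content type directly; its signature and the limit parameter are assumptions made for illustration.

```ruby
require 'json'

# Pulls a short, human-readable message out of an error response body.
def extract_error_message(body, content_type, limit: 100)
  text = body.to_s

  if content_type.to_s.include?('application/json')
    parsed = JSON.parse(text)
    # Only JSON objects carry named error fields
    return (parsed['error'] || parsed['message'] || 'Unknown error').to_s if parsed.is_a?(Hash)
  end

  text[0, limit]
rescue JSON::ParserError
  # Mislabelled or malformed JSON: fall back to the raw text
  text[0, limit]
end

puts extract_error_message('{"error":"bad token"}', 'application/json')
```

Inside the after hook this would be called as extract_error_message(response.body, response.headers['content-type']).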

Performance Monitoring Middleware

Track performance metrics:

module HTTPartyMiddleware
  module Performance
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def with_performance_monitoring(metrics_collector = nil)
        before do |request|
          request.instance_variable_set(:@performance_start, Process.clock_gettime(Process::CLOCK_MONOTONIC))
        end

        after do |request, response|
          start_time = request.instance_variable_get(:@performance_start)
          duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time

          metrics = {
            method: request.http_method::METHOD,
            uri: request.uri.to_s,
            status_code: response.code,
            duration_ms: (duration * 1000).round(2),
            response_size: response.body.to_s.bytesize,
            timestamp: Time.current
          }

          # Send to metrics collector (e.g., StatsD, CloudWatch, etc.)
          metrics_collector&.record(metrics)

          # Log slow requests
          if duration > 5.0  # 5 seconds
            Rails.logger.warn "Slow API request: #{metrics[:method]} #{metrics[:uri]} took #{metrics[:duration_ms]}ms"
          end
        end
      end
    end
  end
end
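
The timing logic does not need to store state on the request object. As a sketch of an alternative, a block-based helper can measure any call with the monotonic clock; measure is a hypothetical name introduced here.

```ruby
# Times a block with the monotonic clock, which is unaffected by
# wall-clock adjustments. Returns [result, elapsed_milliseconds].
def measure
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round(2)
  [result, elapsed_ms]
end

result, elapsed_ms = measure { sleep 0.01; :done }
puts "#{result} took #{elapsed_ms}ms"
```

Wrapping a request as measure { get('/users') } keeps the start time in a local variable, which sidesteps the thread-safety questions that come with instance variables.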

JavaScript Equivalent with Axios Interceptors

For comparison, here's how you would implement similar middleware functionality in JavaScript using Axios:

import axios from 'axios';

// Create axios instance
const apiClient = axios.create({
  baseURL: 'https://api.example.com',
  timeout: 30000,
});

// Request interceptor (equivalent to HTTParty's before callback)
apiClient.interceptors.request.use(
  (config) => {
    // Add request ID for tracking
    const requestId = crypto.randomUUID();
    config.headers['X-Request-ID'] = requestId;

    // Add timestamp
    config.headers['X-Timestamp'] = new Date().toISOString();

    // Add authentication
    const token = getAuthToken();
    if (token) {
      config.headers['Authorization'] = `Bearer ${token}`;
    }

    // Log request
    console.log(`[${requestId}] ${config.method.toUpperCase()} ${config.url}`);

    // Store start time for performance monitoring
    config.metadata = { startTime: Date.now() };

    return config;
  },
  (error) => {
    console.error('Request interceptor error:', error);
    return Promise.reject(error);
  }
);

// Response interceptor (equivalent to HTTParty's after callback)
apiClient.interceptors.response.use(
  (response) => {
    const requestId = response.config.headers['X-Request-ID'];
    const duration = Date.now() - response.config.metadata.startTime;

    console.log(`[${requestId}] Response: ${response.status} (${duration}ms)`);

    // With Axios's default validateStatus, only 2xx responses reach this handler
    return response;
  },
  (error) => {
    const requestId = error.config?.headers['X-Request-ID'];
    const status = error.response?.status;

    console.error(`[${requestId}] Error: ${status} - ${error.message}`);

    // Handle rate limiting (non-2xx statuses such as 429 arrive here)
    if (status === 429) {
      const retryAfter = error.response.headers['retry-after'];
      console.log(`[${requestId}] Rate limited. Retry after: ${retryAfter} seconds`);
    }

    // Custom error handling
    if (status === 401) {
      // Handle unauthorized (browser only)
      window.location.href = '/login';
    }

    return Promise.reject(error);
  }
);

Best Practices

  1. Keep callbacks lightweight: Avoid heavy processing in callbacks as they run on every request
  2. Handle exceptions: Always wrap callback code in exception handling to prevent request failures
  3. Use appropriate logging levels: Debug for detailed info, info for important events, error for failures
  4. Consider thread safety: If using instance variables, ensure thread safety in concurrent environments
  5. Test your middleware: Write unit tests for your custom middleware logic
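
The second and fifth points can be combined in a small runner that isolates callback failures and is trivial to unit test. This is a sketch under assumptions: run_hooks_safely and its log parameter are hypothetical names, and the approach suits logging or metrics hooks rather than hooks that are meant to raise, such as the error handling middleware.

```ruby
# Runs each hook, reporting and swallowing StandardError so that a faulty
# callback cannot abort the request it is attached to.
def run_hooks_safely(hooks, *args, log: $stderr)
  hooks.each do |hook|
    hook.call(*args)
  rescue StandardError => e
    log.puts("callback failed: #{e.class}: #{e.message}")
  end
end
```

Because the hooks are plain callables and the log target is injectable, a test can pass lambdas and a StringIO and assert on both the side effects and the logged message.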

When implementing complex web scraping workflows that require sophisticated request handling, you might also benefit from understanding how to handle browser sessions in Puppeteer for scenarios where HTTParty's capabilities need to be supplemented with browser automation.

For dynamic content, where a request should only count as successful once a specific condition holds, patterns similar to the waitFor function in Puppeteer can be implemented in HTTParty middleware by combining retry logic with conditional response checks.

Conclusion

HTTParty's extension points, together with a thin hook layer around its request methods, provide the building blocks for custom middleware. Whether you need simple logging, complex authentication flows, or sophisticated error handling, the combination of before/after callbacks, custom parsers, and middleware modules gives you the flexibility to build robust HTTP client solutions.

The middleware pattern allows you to compose different behaviors cleanly, making your code more maintainable and testable. Remember to handle edge cases, implement proper error handling, and monitor performance to ensure your middleware enhances rather than hinders your application's reliability.
