How Can I Implement Custom Response Parsing with HTTParty?
HTTParty offers several ways to customize response parsing beyond its default JSON and XML handling. You can parse custom data formats, add validation, or transform responses into domain-specific objects.
Understanding HTTParty's Response Parsing System
HTTParty automatically detects and parses common response formats like JSON and XML based on the Content-Type header. However, when working with APIs that return custom formats or when you need specialized processing, you'll need to implement custom parsing logic.
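As a rough mental model of that detection step, the sketch below maps a Content-Type value to a format symbol and dispatches to a handler. It is a simplified stand-in written with the standard library only, not HTTParty's source; `FORMATS`, `format_for`, and `parse_body` are names made up for illustration.

```ruby
require 'json'

# Hypothetical lookup table: MIME type => format symbol.
FORMATS = {
  'application/json' => :json,
  'text/csv'         => :csv
}.freeze

# Strip parameters such as "; charset=utf-8" before the lookup.
def format_for(content_type)
  mime = content_type.to_s.split(';').first.to_s.strip.downcase
  FORMATS[mime]
end

# Parse known formats; hand back the raw body for anything else.
def parse_body(body, content_type)
  case format_for(content_type)
  when :json then JSON.parse(body)
  when :csv  then body.lines.map { |line| line.chomp.split(',') }
  else body
  end
end
```

Unknown or missing content types fall through to the raw body, which is also a reasonable default for a custom parser.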
Method 1: Using Custom Parser Classes
The most robust approach is to create custom parser classes that implement HTTParty's parser interface. This method gives you complete control over how responses are processed.
Creating a Custom Parser
# Custom CSV parser
class CSVParser < HTTParty::Parser
SupportedFormats = {
'text/csv' => :csv,
'application/csv' => :csv
}.freeze
def csv
require 'csv'
CSV.parse(body, headers: true, header_converters: :symbol)
end
end
# Custom XML parser with additional processing
class CustomXMLParser < HTTParty::Parser
SupportedFormats = {
'application/xml' => :custom_xml,
'text/xml' => :custom_xml
}.freeze
def custom_xml
require 'nokogiri'
doc = Nokogiri::XML(body)
# Custom processing logic
{
metadata: extract_metadata(doc),
items: extract_items(doc),
parsed_at: Time.now # Time.current would require ActiveSupport
}
end
private
def extract_metadata(doc)
{
version: doc.at('//version')&.text,
timestamp: doc.at('//timestamp')&.text
}
end
def extract_items(doc)
doc.xpath('//item').map do |item|
{
id: item['id'],
name: item.at('name')&.text,
value: item.at('value')&.text&.to_f
}
end
end
end
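To see what the `csv` method above hands back, the same `CSV.parse` call can be run on a literal string outside HTTParty. The helper name `parse_csv` and the sample data are invented for illustration.

```ruby
require 'csv'

# Same options as the parser above: the first row is the header,
# and header names are converted to lowercase symbols.
def parse_csv(body)
  CSV.parse(body, headers: true, header_converters: :symbol)
end

table = parse_csv("Name,Age\nJohn,30\nJane,25\n")

# Convert each CSV::Row into a plain Hash for callers that expect one.
rows = table.map(&:to_h)
```

The result is a `CSV::Table` rather than an `Array`, and the values stay strings unless you also pass `converters: :numeric`.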
Registering and Using Custom Parsers
class APIClient
include HTTParty
base_uri 'https://api.example.com'
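# HTTParty keeps a single parser per class, so a second `parser` call
# would replace the first. Pass the parser per request instead.
def self.fetch_csv_data
get('/data.csv', parser: CSVParser)
end
def self.fetch_xml_report
get('/reports/latest.xml', parser: CustomXMLParser)
end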
end
# Usage
csv_data = APIClient.fetch_csv_data
puts csv_data.parsed_response.class # => CSV::Table (CSV.parse with headers: true)
xml_data = APIClient.fetch_xml_report
puts xml_data[:metadata] # => Custom parsed metadata
Method 2: Post-Processing Parsed Responses
For simpler needs, post-process the data after the default parser has run. HTTParty does not document an after_request hook, so the example wraps the request in a small helper method instead.
class APIClient
include HTTParty
base_uri 'https://api.example.com'
def self.fetch_user(id)
annotate(get("/users/#{id}"))
end
# Helper for post-processing; adds custom fields to parsed JSON objects
def self.annotate(response)
data = response.parsed_response
return response unless data.is_a?(Hash)
data['processed_at'] = Time.now
data['request_info'] = {
method: response.request.http_method,
uri: response.request.uri.to_s
}
response
end
end
Method 3: One-Off Processing of the Returned Response
For one-off custom parsing, branch on the content type of the returned response. A block passed to get is called with streamed body fragments rather than the finished response, so it is not the place for this logic:
class DataProcessor
include HTTParty
base_uri 'https://data.example.com'
def self.fetch_and_process_metrics
response = get('/metrics')
case response.content_type
when /json/
process_json_metrics(response.parsed_response)
when /xml/
process_xml_metrics(response.body)
when /text\/plain/
process_text_metrics(response.body)
else
{ error: "Unsupported format: #{response.content_type}" }
end
end
# A bare `private` does not hide `def self.` methods; use
# private_class_method if these helpers must be private.
def self.process_json_metrics(data)
{
total_requests: data['requests'],
avg_response_time: data['metrics']['avg_response_time'],
success_rate: (data['successful'] / data['total'].to_f) * 100
}
end
def self.process_xml_metrics(xml_body)
require 'nokogiri'
doc = Nokogiri::XML(xml_body)
{
total_requests: doc.at('//requests')&.text&.to_i,
avg_response_time: doc.at('//avg_response_time')&.text&.to_f,
errors: doc.xpath('//error').map(&:text)
}
end
def self.process_text_metrics(text_body)
lines = text_body.split("\n")
metrics = {}
lines.each do |line|
key, value = line.split(': ', 2) # limit of 2 keeps values that contain ": " intact
metrics[key.downcase.gsub(' ', '_')] = value if key && value
end
metrics
end
end
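The plain-text branch is the easiest one to try in isolation. The sketch below is a standalone version of that logic for experimenting in irb; the method name `parse_text_metrics` and the sample input are made up for illustration.

```ruby
# Turn "Key: value" lines into a hash with snake_case string keys.
# Lines without a ": " separator are skipped.
def parse_text_metrics(text)
  text.each_line.with_object({}) do |line, metrics|
    key, value = line.chomp.split(': ', 2)
    next unless key && value

    metrics[key.downcase.gsub(' ', '_')] = value
  end
end

sample = "Total Requests: 1200\nAvg Response Time: 35.2\nmalformed line\n"
metrics = parse_text_metrics(sample)
# metrics has the keys "total_requests" and "avg_response_time";
# the malformed line is dropped.
```

Values remain strings here; convert them with `to_i` or `to_f` at the point of use if you need numbers.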
Method 4: Creating Response Wrapper Classes
For complex applications, you might want to wrap responses in custom classes that provide domain-specific methods:
class APIResponse
attr_reader :raw_response, :data, :metadata
def initialize(httparty_response)
@raw_response = httparty_response
@data = parse_data
@metadata = extract_metadata
end
def success?
@raw_response.success?
end
def error_message
@data.dig('error', 'message') if @data.is_a?(Hash)
end
def paginated?
!@metadata[:pagination].nil? # plain Ruby; present? would require ActiveSupport
end
def next_page_url
@metadata.dig(:pagination, :next_url)
end
private
def parse_data
case @raw_response.content_type
when /json/
parsed = @raw_response.parsed_response
# Custom JSON processing
transform_json_keys(parsed)
when /xml/
# Custom XML processing
transform_xml_to_hash(@raw_response.body)
else
@raw_response.body
end
end
def extract_metadata
{
status_code: @raw_response.code,
content_type: @raw_response.content_type,
response_time: @raw_response.headers['x-response-time'],
pagination: extract_pagination_info
}
end
def transform_json_keys(hash)
return hash unless hash.is_a?(Hash)
hash.transform_keys { |key| underscore(key) }
.transform_values { |value| value.is_a?(Hash) ? transform_json_keys(value) : value }
end
# Plain-Ruby stand-in for ActiveSupport's String#underscore (camelCase => snake_case)
def underscore(name)
name.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').tr('-', '_').downcase
end
def transform_xml_to_hash(xml_body)
require 'nokogiri'
doc = Nokogiri::XML(xml_body)
# Custom XML to hash conversion logic
xml_to_hash(doc.root)
end
def xml_to_hash(node)
if node.children.any? { |child| child.element? }
result = {}
node.children.each do |child|
next unless child.element?
key = underscore(child.name)
result[key] = xml_to_hash(child)
end
result
else
node.text
end
end
def extract_pagination_info
if @data.is_a?(Hash) && @data['pagination']
{
current_page: @data['pagination']['current_page'],
total_pages: @data['pagination']['total_pages'],
next_url: @data['pagination']['next_url'],
prev_url: @data['pagination']['prev_url']
}
end
end
end
# Usage with wrapper class
class EnhancedAPIClient
include HTTParty
base_uri 'https://api.example.com'
def self.fetch_users(params = {})
response = get('/users', query: params)
APIResponse.new(response)
end
end
# Using the enhanced client
users_response = EnhancedAPIClient.fetch_users(page: 1, limit: 10)
if users_response.success?
puts "Found #{users_response.data['users'].length} users"
if users_response.paginated?
puts "Next page: #{users_response.next_page_url}"
end
else
puts "Error: #{users_response.error_message}"
end
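A wrapper that exposes `next_page_url` makes it straightforward to walk every page. The sketch below shows one way to do that, assuming only that each page object responds to `next_page_url` and `data`. `Page`, `collect_pages`, and `FAKE_PAGES` are hypothetical stand-ins so the loop can run without a network call.

```ruby
# Minimal stand-in for the wrapper above: only the two methods the loop needs.
Page = Struct.new(:data, :next_page_url)

# Follow next_page_url until it is nil, or until max_pages is reached.
# `fetcher` is any callable that takes a URL and returns a page object.
def collect_pages(first_url, fetcher, max_pages: 100)
  pages = []
  url = first_url
  while url && pages.length < max_pages
    page = fetcher.call(url)
    pages << page
    url = page.next_page_url
  end
  pages
end

# Fake two-page API keyed by URL, standing in for a real client.
FAKE_PAGES = {
  '/users?page=1' => Page.new(%w[ann bob], '/users?page=2'),
  '/users?page=2' => Page.new(%w[cat], nil)
}.freeze

all_users = collect_pages('/users?page=1', FAKE_PAGES.method(:fetch)).flat_map(&:data)
```

The `max_pages` cap is a guard against an API that keeps returning a next URL; with a real client the fetcher would be a lambda that requests the URL and wraps the result in `APIResponse`.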
Advanced Parsing Techniques
Handling Binary Data
class FileDownloader
include HTTParty
def self.download_image(url)
# A block passed to get receives streamed fragments, and stream_body: true
# leaves response.body empty, so read the finished response instead.
response = get(url)
content_type = response.content_type.to_s
return { error: 'Not an image file' } unless content_type.start_with?('image/')
{
content_type: content_type,
size: response.body.bytesize,
data: response.body,
filename: extract_filename(response.headers['content-disposition'])
}
end
# Returns nil when the header is missing or carries no filename
def self.extract_filename(content_disposition)
return nil unless content_disposition
content_disposition[/filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/, 1]
&.delete(%q("'))
end
end
Conditional Parsing Based on Response Headers
class SmartAPIClient
include HTTParty
# A parser is called with (body, format) and never sees the headers, so
# let HTTParty parse as usual and branch on the header afterwards.
# `fetch` is an illustrative method name.
def self.fetch(path)
response = get(path)
data = response.parsed_response
case response.headers['x-api-version']
when '1.0'
# Transform v1 format to standardized format
transform_v1_to_standard(data)
else
# v2 and unknown versions are used as parsed
data
end
end
def self.transform_v1_to_standard(data)
return data unless data.is_a?(Hash)
{
version: '1.0',
data: data['payload'],
metadata: {
timestamp: data['timestamp'],
request_id: data['req_id']
}
}
end
end
Error Handling in Custom Parsers
class RobustParser < HTTParty::Parser
SupportedFormats = {
'application/json' => :safe_json,
'text/xml' => :safe_xml
}.freeze
def safe_json
JSON.parse(body)
rescue JSON::ParserError => e
{
error: 'JSON parsing failed',
message: e.message,
raw_body: body[0, 500] # First 500 chars for debugging
}
end
def safe_xml
require 'nokogiri'
Nokogiri::XML(body) do |config|
config.strict.nonet
end
rescue Nokogiri::XML::SyntaxError => e
{
error: 'XML parsing failed',
message: e.message,
line: e.line,
column: e.column
}
end
end
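The rescue pattern in `safe_json` can be exercised without HTTParty. The sketch below repeats it as a plain method (the standalone signature taking `body` as an argument is an adaptation for illustration) so you can see what a caller receives on failure.

```ruby
require 'json'

# Parse JSON, returning a descriptive error hash instead of raising.
def safe_json(body)
  JSON.parse(body)
rescue JSON::ParserError => e
  {
    error: 'JSON parsing failed',
    message: e.message,
    raw_body: body[0, 500] # first 500 chars for debugging
  }
end

good = safe_json('{"ok": true}')
bad  = safe_json('{oops')
```

Because `JSON.parse` produces string keys by default, a caller can tell the two apart by checking for the symbol key `:error`, which a successfully parsed payload will not contain.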
Testing Custom Parsers
# spec/parsers/csv_parser_spec.rb
RSpec.describe CSVParser do
let(:csv_body) { "name,age,city\nJohn,30,NYC\nJane,25,LA" }
describe '#csv' do
subject { described_class.new(csv_body, :csv) }
it 'parses CSV data correctly' do
result = subject.csv
# CSV.parse with headers: true returns a CSV::Table of CSV::Row objects
expect(result).to be_a(CSV::Table)
expect(result.first.to_h).to include(name: 'John', age: '30', city: 'NYC')
end
end
end
Best Practices for Custom Response Parsing
Handle Errors Gracefully: Always include error handling in your custom parsers to prevent application crashes from malformed data.
Validate Input: Check content types and response codes before attempting to parse.
Performance Considerations: For large responses, consider streaming parsers or processing data in chunks.
Memory Management: Be mindful of memory usage when parsing large datasets.
Testing: Write comprehensive tests for your custom parsers with various input scenarios.
Documentation: Document your custom parsing logic and expected input/output formats.
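As a small illustration of the performance and memory points, the sketch below reads CSV rows one at a time from an IO object instead of building the whole table with `CSV.parse`. It uses only the standard library; `sum_column` and the sample data are made up for the example.

```ruby
require 'csv'
require 'stringio'

# Sum one column row by row so only the current row is held in memory.
# `io` can be any IO-like object (a File, a socket, a StringIO).
def sum_column(io, column)
  total = 0.0
  CSV.new(io, headers: true).each do |row|
    total += row[column].to_f
  end
  total
end

data = StringIO.new("name,amount\na,1.5\nb,2.5\n")
total = sum_column(data, 'amount')
```

For a large download, one option is to write the body to a temporary file first and pass the opened file to a method like this one.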
Custom response parsing with HTTParty provides the flexibility needed for complex data processing scenarios. Whether you're working with proprietary formats, need specialized validation, or want to create domain-specific response objects, these techniques will help you build robust and maintainable parsing solutions.
When implementing custom parsing, consider combining it with proper error handling techniques and authentication strategies for a complete web scraping solution.