How do I upload files using multipart/form-data with Requests?
Uploading files using multipart/form-data is a common requirement in web scraping and API interactions. The Python Requests library provides several convenient methods to handle file uploads efficiently. This comprehensive guide covers everything you need to know about uploading files with Requests.
Understanding Multipart/Form-Data
Multipart/form-data is an encoding type used in HTTP requests to upload files and binary data. Unlike standard form data, this encoding allows you to send files along with other form fields in a single request. The content is divided into multiple parts, each containing different data types.
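To see what this encoding actually produces, you can build a request without sending it. A small sketch using Requests' `PreparedRequest` (the httpbin URL is only a placeholder, since nothing is sent):

```python
import requests

# Prepare (but do not send) a request mixing a form field and a file part
req = requests.Request(
    'POST',
    'https://httpbin.org/post',
    data={'field': 'value'},
    files={'file': ('notes.txt', b'hello world', 'text/plain')},
).prepare()

# The Content-Type header carries the boundary string separating the parts
print(req.headers['Content-Type'])
# The body contains each part with its own headers
print(req.body[:200])
```

The boundary is generated per request; each form field and each file becomes one part delimited by that boundary.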
Basic File Upload
The simplest way to upload a file with Requests is to use the `files` parameter:
```python
import requests

# Basic file upload
with open('document.pdf', 'rb') as file:
    files = {'file': file}
    response = requests.post('https://httpbin.org/post', files=files)

print(response.status_code)
print(response.json())
```
This automatically sets the `Content-Type` header to `multipart/form-data` and handles the encoding for you.
Uploading Multiple Files
You can upload multiple files in a single request by providing multiple entries in the files dictionary:
```python
import requests

# Upload multiple files
files = {
    'file1': open('document1.pdf', 'rb'),
    'file2': open('document2.txt', 'rb'),
    'file3': open('image.png', 'rb')
}

try:
    response = requests.post('https://httpbin.org/post', files=files)
    print(f"Status: {response.status_code}")
    print(response.json())
finally:
    # Always close file handles
    for file in files.values():
        file.close()
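If the server expects several files under the same field name, `files` can also be a list of `(name, value)` tuples instead of a dict. A sketch, again prepared locally rather than sent:

```python
import requests

# Repeating a field name requires the list form; a dict would collapse the keys
files = [
    ('attachments', ('a.txt', b'first file', 'text/plain')),
    ('attachments', ('b.txt', b'second file', 'text/plain')),
]
req = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()

# Each tuple becomes its own multipart section with the same field name
print(req.body.count(b'name="attachments"'))  # 2
```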
Combining Files with Form Data
Often, you need to send additional form fields along with files. Use both the `files` and `data` parameters:
```python
import requests

# Upload file with additional form data
form_data = {
    'username': 'john_doe',
    'description': 'Document upload',
    'category': 'reports'
}

with open('report.pdf', 'rb') as file:
    files = {'document': file}
    response = requests.post(
        'https://api.example.com/upload',
        files=files,
        data=form_data
    )
    print(response.status_code)
```
Advanced File Upload Configuration
Custom Filename and Content Type
You can specify custom filenames and content types using tuples:
```python
import requests

# Custom filename and content type
with open('data.json', 'rb') as file:
    files = {
        'file': ('custom_name.json', file, 'application/json')
    }
    response = requests.post('https://httpbin.org/post', files=files)
```
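The tuple can take several shapes. Each entry below is a valid `files` value, illustrated with in-memory bytes and inspected locally rather than sent:

```python
import requests

files = {
    # (filename, data): no explicit content type is set for the part
    'plain': ('a.txt', b'text'),
    # (filename, data, content_type): explicit content type
    'typed': ('b.json', b'{}', 'application/json'),
    # (filename, data, content_type, headers): extra per-part headers
    'custom': ('c.bin', b'\x00\x01', 'application/octet-stream',
               {'X-Part-Header': 'demo'}),
}
req = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()
print(b'X-Part-Header: demo' in req.body)  # True
```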
Uploading from Memory
Upload data directly from memory without creating temporary files:
```python
import requests
import io

# Upload from memory
data = b"This is file content in memory"
files = {
    'file': ('memory_file.txt', io.BytesIO(data), 'text/plain')
}
response = requests.post('https://httpbin.org/post', files=files)
print(response.status_code)
```
Using requests-toolbelt for Advanced Multipart
For more complex multipart requirements, use the `requests-toolbelt` library:
```python
# Install with: pip install requests-toolbelt
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

# Open the file in a context manager so the handle is always closed
with open('file.txt', 'rb') as file:
    multipart_data = MultipartEncoder(
        fields={
            'field1': 'value1',
            'field2': ('filename.txt', file, 'text/plain'),
            'field3': 'value3'
        }
    )
    response = requests.post(
        'https://httpbin.org/post',
        data=multipart_data,
        headers={'Content-Type': multipart_data.content_type}
    )
```
Error Handling and Best Practices
Proper File Handling
Always use context managers or try-finally blocks to ensure files are properly closed:
```python
import requests

def upload_file_safely(file_path, upload_url):
    try:
        with open(file_path, 'rb') as file:
            files = {'file': file}
            response = requests.post(upload_url, files=files)
            response.raise_for_status()  # Raise exception for bad status codes
            return response.json()
    except requests.exceptions.RequestException as e:
        print(f"Upload failed: {e}")
        return None
    except FileNotFoundError:
        print(f"File not found: {file_path}")
        return None

# Usage
result = upload_file_safely('document.pdf', 'https://api.example.com/upload')
```
Large File Upload with Progress
For large files, implement progress tracking:
```python
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder, MultipartEncoderMonitor

def upload_with_progress(file_path, upload_url):
    def progress_callback(monitor):
        progress = (monitor.bytes_read / monitor.len) * 100
        print(f"Upload progress: {progress:.1f}%")

    with open(file_path, 'rb') as file:
        multipart_data = MultipartEncoder(
            fields={'file': (file_path, file, 'application/octet-stream')}
        )
        monitor = MultipartEncoderMonitor(multipart_data, progress_callback)
        response = requests.post(
            upload_url,
            data=monitor,
            headers={'Content-Type': monitor.content_type}
        )
    return response

# Usage
response = upload_with_progress('large_file.zip', 'https://api.example.com/upload')
```
Authentication and Headers
Many APIs require authentication for file uploads:
```python
import requests

# Upload with authentication
headers = {
    'Authorization': 'Bearer your-api-token',
    'X-Custom-Header': 'custom-value'
}

with open('secure_document.pdf', 'rb') as file:
    files = {'file': file}
    data = {'visibility': 'private'}
    response = requests.post(
        'https://secure-api.example.com/upload',
        files=files,
        data=data,
        headers=headers
    )

if response.status_code == 200:
    print("Upload successful!")
    print(response.json())
else:
    print(f"Upload failed: {response.status_code}")
```
Web Scraping Context
When web scraping, you might need to upload files to forms, particularly when automating workflows that involve document submission. While Requests handles the HTTP layer, you may also need to handle authentication flows or monitor the network requests that precede a file upload.
Debugging Upload Issues
Inspect Request Details
Debug upload issues by examining the actual request:
```python
import requests
import logging

# Enable debug logging (urllib3 powers Requests' HTTP layer)
logging.basicConfig(level=logging.DEBUG)
urllib3_log = logging.getLogger("urllib3")
urllib3_log.setLevel(logging.DEBUG)
urllib3_log.propagate = True

# Make upload request with debugging
with open('debug_file.txt', 'rb') as file:
    files = {'file': file}
    response = requests.post('https://httpbin.org/post', files=files)
```
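Requests ultimately goes through the standard library's `http.client`, which has its own debug switch. A minimal sketch:

```python
import http.client

# Even lower-level than logging: have http.client echo raw request and
# response lines to stdout (very noisy; intended for debugging only)
http.client.HTTPConnection.debuglevel = 1

# ... perform the upload under inspection here ...

# Reset when done so later requests are quiet again
http.client.HTTPConnection.debuglevel = 0
```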
Validate Server Response
Always check server responses for upload validation:
```python
import requests

def validate_upload_response(response):
    if response.status_code == 200:
        try:
            data = response.json()
            if 'files' in data:
                print("Upload successful!")
                return True
        except ValueError:
            print("Invalid JSON response")
    elif response.status_code == 413:
        print("File too large")
    elif response.status_code == 415:
        print("Unsupported file type")
    else:
        print(f"Upload failed: {response.status_code} - {response.text}")
    return False

# Usage
with open('test_file.txt', 'rb') as file:
    files = {'file': file}
    response = requests.post('https://httpbin.org/post', files=files)
    validate_upload_response(response)
```
Performance Optimization
Session Reuse
Use sessions for multiple uploads to reuse connections:
```python
import requests

def bulk_upload(file_paths, upload_url):
    session = requests.Session()
    session.headers.update({'Authorization': 'Bearer your-token'})
    results = []
    for file_path in file_paths:
        try:
            with open(file_path, 'rb') as file:
                files = {'file': file}
                response = session.post(upload_url, files=files)
                results.append({
                    'file': file_path,
                    'status': response.status_code,
                    'success': response.status_code == 200
                })
        except Exception as e:
            results.append({
                'file': file_path,
                'status': 'error',
                'error': str(e)
            })
    session.close()
    return results

# Upload multiple files efficiently
files_to_upload = ['file1.pdf', 'file2.txt', 'file3.jpg']
results = bulk_upload(files_to_upload, 'https://api.example.com/upload')
```
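Connection pooling can be tuned beyond the defaults by mounting a custom `HTTPAdapter` on the session. A sketch (the pool sizes and retry count are illustrative, and the URL is a placeholder):

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# Keep up to 10 pooled connections per host and retry failed connections
# up to 3 times before giving up
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=10, max_retries=3)
session.mount('https://', adapter)

# The adapter now handles every https:// URL this session requests
print(session.get_adapter('https://api.example.com/upload').max_retries.total)  # 3
```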
Console Commands and Testing
Test your file upload implementation using these console commands:
```bash
# Test endpoint with curl for comparison
curl -X POST \
  -F "file=@/path/to/file.pdf" \
  -F "field1=value1" \
  https://httpbin.org/post

# Check file size before upload
ls -lh file.pdf

# Monitor upload with curl progress
curl -X POST \
  -F "file=@large_file.zip" \
  --progress-bar \
  https://api.example.com/upload
```
JavaScript Alternative
If you need to upload files from JavaScript in web scraping contexts:
```javascript
// Using fetch API for file upload
const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('description', 'Uploaded file');

fetch('https://api.example.com/upload', {
  method: 'POST',
  body: formData,
  headers: {
    'Authorization': 'Bearer your-token'
  }
})
  .then(response => response.json())
  .then(data => console.log('Upload successful:', data))
  .catch(error => console.error('Upload failed:', error));
```
Common Pitfalls and Solutions
- File Handle Leaks: Always close files properly using context managers
- Large File Memory Usage: Use streaming for large files with `requests-toolbelt`
- Incorrect Content-Type: Let Requests set multipart headers automatically
- Binary vs Text Mode: Always open files in binary mode (`'rb'`) for uploads
- Authentication Issues: Ensure proper headers are set before upload
- File Path Issues: Use absolute paths or verify the working directory
- Network Timeouts: Set appropriate timeout values for large uploads
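A small sketch pulling several of these points together, with context-managed file handles, binary mode, and an explicit timeout (the URL and timeout values are illustrative):

```python
import requests

def upload_with_timeout(file_path, upload_url):
    # Context manager closes the handle even if the request raises
    with open(file_path, 'rb') as file:  # binary mode for uploads
        files = {'file': file}
        return requests.post(
            upload_url,
            files=files,
            timeout=(5, 300),  # (connect, read) timeouts in seconds
        )
```

Without `timeout`, Requests can block indefinitely on a stalled connection; a `(connect, read)` tuple lets you keep connection setup strict while allowing a long read window for the upload itself.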
Testing File Uploads
Create a simple test function to verify your upload implementation:
```python
import requests
import tempfile
import os

def test_file_upload():
    # Create a temporary file for testing
    with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.txt') as temp_file:
        temp_file.write("This is test content for file upload")
        temp_file_path = temp_file.name

    try:
        # Test the upload
        with open(temp_file_path, 'rb') as file:
            files = {'file': file}
            response = requests.post('https://httpbin.org/post', files=files)

        assert response.status_code == 200
        response_data = response.json()
        assert 'files' in response_data
        print("File upload test passed!")
    finally:
        # Clean up temporary file
        os.unlink(temp_file_path)

# Run the test
test_file_upload()
```
Conclusion
The Python Requests library provides powerful and flexible options for uploading files using multipart/form-data. Whether you're uploading single files, multiple files, or combining files with form data, Requests handles the complexity of multipart encoding automatically. Remember to implement proper error handling, use context managers for file operations, and consider using `requests-toolbelt` for advanced multipart requirements.
For complex web scraping scenarios involving file uploads, you may need to combine Requests with browser automation tools like Puppeteer to complete the entire workflow, from initial page navigation to final file submission.