What is the Best Way to Export Scraped Data to Excel from n8n?
Exporting scraped data to Excel from n8n workflows is a common requirement for data analysis, reporting, and business intelligence. Unlike CSV exports, Excel files (.xlsx) support advanced features like multiple sheets, cell formatting, formulas, charts, and data validation. This comprehensive guide covers multiple approaches to export your scraped data to Excel format, from simple built-in solutions to advanced custom formatting techniques.
Why Choose Excel Over CSV?
Before diving into implementation, it's worth understanding when Excel exports are preferable:
- Multiple worksheets - Organize different data types in separate sheets
- Rich formatting - Apply colors, fonts, borders, and number formats
- Formulas and calculations - Add computed columns and totals
- Charts and graphs - Include visualizations alongside data
- Data validation - Define dropdown lists and input constraints
- Freeze panes and filters - Improve data navigation
- Cell protection - Lock specific cells or sheets
Method 1: Using the Spreadsheet File Node (Recommended)
The Spreadsheet File node in n8n is the simplest and most reliable way to create Excel files from scraped data. This built-in node handles all the complexity of the XLSX format automatically.
Basic Excel Export Workflow
A typical n8n workflow for exporting scraped data to Excel follows this structure:
- Data Source - Webhook, Schedule, or Manual trigger
- Web Scraping - HTTP Request, HTML Extract, or Code node
- Data Processing - Transform and clean data
- Spreadsheet File - Convert to Excel format
- Storage/Delivery - Save to cloud storage or email
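For instance, the scraping step can be as simple as an HTTP Request node followed by an HTML node in "Extract HTML Content" mode; the selectors below are illustrative placeholders:
// HTML Node Configuration (illustrative)
{
  "operation": "extractHtmlContent",
  "extractionValues": [
    { "key": "product_name", "cssSelector": ".product .name" },
    { "key": "price", "cssSelector": ".product .price" },
    { "key": "rating", "cssSelector": ".product .rating" }
  ]
}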
Configuration Example
Here's how to configure the Spreadsheet File node for Excel export:
Node Settings:
- Operation: Write to File
- File Format: XLSX (Excel)
- Options:
  - Sheet Name: Scraped Data
  - Include Headers: ✓
  - Auto Width: ✓
// Sample scraped data structure
[
  {
    "product_name": "Wireless Mouse",
    "price": 24.99,
    "rating": 4.5,
    "reviews_count": 1250,
    "availability": "In Stock",
    "url": "https://example.com/mouse"
  },
  {
    "product_name": "Mechanical Keyboard",
    "price": 89.99,
    "rating": 4.8,
    "reviews_count": 3420,
    "availability": "In Stock",
    "url": "https://example.com/keyboard"
  }
]
The Spreadsheet File node automatically converts this JSON structure into a properly formatted Excel file with headers and data rows.
Method 2: Advanced Excel Generation with Code Node
For complete control over Excel formatting, use the Code node with the popular xlsx library. This approach enables custom styling, multiple sheets, formulas, and advanced Excel features.
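Note that the Code node can only require external npm packages such as xlsx on self-hosted n8n instances where those packages are installed and explicitly allow-listed via an environment variable (the package list below is an example):
# n8n environment configuration
NODE_FUNCTION_ALLOW_EXTERNAL=xlsx,exceljs,puppeteer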
JavaScript Implementation with xlsx Library
// n8n Code Node - JavaScript
const XLSX = require('xlsx');

// Get scraped data from previous node
const items = $input.all();
const data = items.map(item => item.json);

// Create a new workbook
const workbook = XLSX.utils.book_new();

// Convert data to worksheet
const worksheet = XLSX.utils.json_to_sheet(data);

// Set column widths
worksheet['!cols'] = [
  { wch: 30 }, // Product Name
  { wch: 12 }, // Price
  { wch: 10 }, // Rating
  { wch: 15 }, // Reviews Count
  { wch: 15 }, // Availability
  { wch: 50 }  // URL
];

// Add worksheet to workbook
XLSX.utils.book_append_sheet(workbook, worksheet, 'Products');

// Create metadata sheet
const metadataSheet = XLSX.utils.aoa_to_sheet([
  ['Scrape Date', new Date().toISOString()],
  ['Total Records', data.length],
  ['Source', 'n8n Web Scraper'],
  ['Workflow', $workflow.name]
]);
XLSX.utils.book_append_sheet(workbook, metadataSheet, 'Metadata');

// Write to buffer
const excelBuffer = XLSX.write(workbook, {
  type: 'buffer',
  bookType: 'xlsx'
});

// Return as binary data
return [{
  json: {
    records: data.length,
    filename: `scraped_data_${Date.now()}.xlsx`
  },
  binary: {
    data: {
      data: excelBuffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `scraped_data_${Date.now()}.xlsx`
    }
  }
}];
Python Implementation with openpyxl
# n8n Code Node - Python
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
import base64
from io import BytesIO
from datetime import datetime

# Get input data
items = _input.all()
data = [item['json'] for item in items]

# Create workbook
wb = Workbook()
ws = wb.active
ws.title = "Scraped Data"

# Add headers with formatting
headers = ['Product Name', 'Price', 'Rating', 'Reviews', 'Availability', 'URL']
header_fill = PatternFill(start_color='366092', end_color='366092', fill_type='solid')
header_font = Font(color='FFFFFF', bold=True)

for col, header in enumerate(headers, start=1):
    cell = ws.cell(row=1, column=col, value=header)
    cell.fill = header_fill
    cell.font = header_font
    cell.alignment = Alignment(horizontal='center')

# Add data rows
for row_idx, item in enumerate(data, start=2):
    ws.cell(row=row_idx, column=1, value=item.get('product_name'))
    ws.cell(row=row_idx, column=2, value=item.get('price'))
    ws.cell(row=row_idx, column=3, value=item.get('rating'))
    ws.cell(row=row_idx, column=4, value=item.get('reviews_count'))
    ws.cell(row=row_idx, column=5, value=item.get('availability'))
    ws.cell(row=row_idx, column=6, value=item.get('url'))

# Set column widths
ws.column_dimensions['A'].width = 30
ws.column_dimensions['B'].width = 12
ws.column_dimensions['C'].width = 10
ws.column_dimensions['D'].width = 15
ws.column_dimensions['E'].width = 15
ws.column_dimensions['F'].width = 50

# Freeze top row
ws.freeze_panes = 'A2'

# Save to buffer
buffer = BytesIO()
wb.save(buffer)
excel_data = buffer.getvalue()
excel_base64 = base64.b64encode(excel_data).decode()

# Return binary data
return [{
    'json': {
        'records': len(data),
        'filename': f'scraped_data_{int(datetime.now().timestamp())}.xlsx'
    },
    'binary': {
        'data': {
            'data': excel_base64,
            'mimeType': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
            'fileName': f'scraped_data_{int(datetime.now().timestamp())}.xlsx'
        }
    }
}]
Method 3: Creating Multi-Sheet Excel Files
One of Excel's most powerful features is the ability to organize related data across multiple worksheets. This is particularly useful when scraping different data types or categories.
Multi-Sheet Example
// Code Node - Multiple sheets for different categories
const XLSX = require('xlsx');
const items = $input.all();

// Organize data by category
const categories = {};
items.forEach(item => {
  const category = item.json.category || 'Uncategorized';
  if (!categories[category]) {
    categories[category] = [];
  }
  categories[category].push(item.json);
});

// Create workbook
const workbook = XLSX.utils.book_new();

// Add a sheet for each category
Object.keys(categories).forEach(category => {
  const worksheet = XLSX.utils.json_to_sheet(categories[category]);
  // Auto-fit columns
  const cols = Object.keys(categories[category][0]).map(() => ({ wch: 15 }));
  worksheet['!cols'] = cols;
  XLSX.utils.book_append_sheet(workbook, worksheet, category);
});

// Add summary sheet
const summaryData = Object.keys(categories).map(cat => ({
  Category: cat,
  'Item Count': categories[cat].length,
  'Avg Price': (categories[cat].reduce((sum, item) =>
    sum + (parseFloat(item.price) || 0), 0) / categories[cat].length).toFixed(2)
}));
const summarySheet = XLSX.utils.json_to_sheet(summaryData);
XLSX.utils.book_append_sheet(workbook, summarySheet, 'Summary');

// Export to buffer
const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });

return [{
  json: { sheets: Object.keys(categories).length + 1 },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `categorized_data_${Date.now()}.xlsx`
    }
  }
}];
Method 4: Advanced Formatting and Styling
When presenting data to stakeholders, professional formatting makes Excel exports more readable and actionable.
Adding Cell Formatting
// Advanced formatting with the exceljs library
const ExcelJS = require('exceljs');

const workbook = new ExcelJS.Workbook();
const worksheet = workbook.addWorksheet('Scraped Data');

// Define columns with formatting
worksheet.columns = [
  { header: 'Product Name', key: 'product_name', width: 30 },
  { header: 'Price', key: 'price', width: 12, style: { numFmt: '$#,##0.00' } },
  { header: 'Rating', key: 'rating', width: 10, style: { numFmt: '0.0' } },
  { header: 'Reviews', key: 'reviews_count', width: 15, style: { numFmt: '#,##0' } },
  { header: 'Stock Status', key: 'availability', width: 15 }
];

// Style header row
worksheet.getRow(1).font = { bold: true, color: { argb: 'FFFFFFFF' } };
worksheet.getRow(1).fill = {
  type: 'pattern',
  pattern: 'solid',
  fgColor: { argb: 'FF366092' }
};
worksheet.getRow(1).alignment = { vertical: 'middle', horizontal: 'center' };

// Add data
const items = $input.all();
items.forEach(item => {
  const row = worksheet.addRow(item.json);
  // Conditional formatting: highlight low stock
  if (item.json.availability !== 'In Stock') {
    row.getCell('availability').fill = {
      type: 'pattern',
      pattern: 'solid',
      fgColor: { argb: 'FFFFEB9C' }
    };
  }
  // Color-code ratings
  const ratingCell = row.getCell('rating');
  if (item.json.rating >= 4.5) {
    ratingCell.font = { color: { argb: 'FF00B050' } };
  } else if (item.json.rating < 3.0) {
    ratingCell.font = { color: { argb: 'FFFF0000' } };
  }
});

// Add filters
worksheet.autoFilter = {
  from: 'A1',
  to: `E${items.length + 1}`
};

// Write to buffer
const buffer = await workbook.xlsx.writeBuffer();

return [{
  json: { formatted: true },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `formatted_data_${Date.now()}.xlsx`
    }
  }
}];
Method 5: Saving Excel Files to Various Destinations
After creating your Excel file, you need to save or deliver it. n8n offers multiple storage and delivery options.
Save to Google Drive
// After Spreadsheet File node, add Google Drive node
// Google Drive Node Configuration:
{
  "operation": "upload",
  "name": "scraped_data_{{ $now.format('yyyy-MM-dd_HH-mm') }}.xlsx",
  "parents": {
    "folderId": "YOUR_FOLDER_ID"
  },
  "binaryData": true
}
Send via Email with Attachment
// Email Node Configuration
{
  "operation": "send",
  "fromEmail": "scraper@yourdomain.com",
  "toEmail": "team@yourdomain.com",
  "subject": "Daily Scraping Report - {{ $now.format('yyyy-MM-dd') }}",
  "text": "Attached is today's scraped data report.",
  "options": {
    "attachments": "data" // Name of the binary property from the Spreadsheet File node
  }
}
Upload to Dropbox
- Add Dropbox node after your Excel generation
- Set Operation to "Upload"
- Configure:
  - Path: /Scraped Data/report_{{ $now.format('yyyy-MM-dd') }}.xlsx
  - Binary Data: ✓
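Expressed in the same pseudo-configuration style as the other destinations (the path is illustrative):
// Dropbox Node Configuration
{
  "operation": "upload",
  "path": "/Scraped Data/report_{{ $now.format('yyyy-MM-dd') }}.xlsx",
  "binaryData": true
}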
Upload to AWS S3
// AWS S3 Node Configuration
{
  "operation": "upload",
  "bucketName": "your-bucket-name",
  "fileName": "scraped-data/{{ $now.format('yyyy/MM/dd') }}/data.xlsx",
  "binaryData": true,
  "additionalFields": {
    "contentType": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
  }
}
Method 6: Handling Large Datasets
When scraping large amounts of data, consider memory constraints, processing time, and Excel's hard limit of 1,048,576 rows per worksheet.
Splitting Large Datasets Across Sheets
// Code Node - Batch processing for large datasets
const XLSX = require('xlsx');
const BATCH_SIZE = 5000;
const items = $input.all();

// Create workbook
const workbook = XLSX.utils.book_new();

// Process in batches
for (let i = 0; i < items.length; i += BATCH_SIZE) {
  const batch = items.slice(i, i + BATCH_SIZE);
  const batchData = batch.map(item => item.json);
  const sheetName = `Data ${Math.floor(i / BATCH_SIZE) + 1}`;
  const worksheet = XLSX.utils.json_to_sheet(batchData);
  XLSX.utils.book_append_sheet(workbook, worksheet, sheetName);
}

// Write to buffer
const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });

return [{
  json: {
    totalRecords: items.length,
    sheets: Math.ceil(items.length / BATCH_SIZE)
  },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `large_dataset_${Date.now()}.xlsx`
    }
  }
}];
Scraping Dynamic Content for Excel Export
When scraping JavaScript-heavy websites, you may need browser automation. The n8n integration with Puppeteer for handling dynamic content (which, like the other external libraries, requires a self-hosted instance with the package installed and allow-listed) can be combined with Excel export:
// Code Node - Puppeteer scraping + Excel export
const puppeteer = require('puppeteer');
const XLSX = require('xlsx');

async function scrapeAndExport() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Navigate and wait for content
  await page.goto('https://example.com/products', { waitUntil: 'networkidle2' });

  // Extract data
  const products = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('.product')).map(el => ({
      name: el.querySelector('.name')?.textContent.trim(),
      price: el.querySelector('.price')?.textContent.trim(),
      rating: el.querySelector('.rating')?.textContent.trim()
    }));
  });

  await browser.close();

  // Create Excel file
  const workbook = XLSX.utils.book_new();
  const worksheet = XLSX.utils.json_to_sheet(products);
  XLSX.utils.book_append_sheet(workbook, worksheet, 'Products');
  const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });
  return buffer;
}

const excelBuffer = await scrapeAndExport();

return [{
  json: {},
  binary: {
    data: {
      data: excelBuffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `scraped_products_${Date.now()}.xlsx`
    }
  }
}];
Data Validation and Quality Control
Before exporting to Excel, implement data validation to ensure quality:
// Code Node - Data validation before Excel export
const items = $input.all();
const validatedData = [];
const errors = [];

items.forEach((item, index) => {
  const data = item.json;
  const rowErrors = [];

  // Validate required fields
  if (!data.product_name || data.product_name.trim() === '') {
    rowErrors.push('Missing product name');
  }
  // Validate price format
  if (data.price && isNaN(parseFloat(data.price))) {
    rowErrors.push('Invalid price format');
  }
  // Validate rating range
  if (data.rating && (data.rating < 0 || data.rating > 5)) {
    rowErrors.push('Rating out of range');
  }
  // Validate URL format
  if (data.url && !data.url.match(/^https?:\/\/.+/)) {
    rowErrors.push('Invalid URL format');
  }

  if (rowErrors.length > 0) {
    errors.push({
      row: index + 1,
      errors: rowErrors.join('; '),
      data: JSON.stringify(data) // stringify so json_to_sheet can render it in one cell
    });
  } else {
    validatedData.push({
      ...data,
      validated_at: new Date().toISOString()
    });
  }
});

// Create Excel with both valid data and error log
const XLSX = require('xlsx');
const workbook = XLSX.utils.book_new();

// Add valid data sheet
const dataSheet = XLSX.utils.json_to_sheet(validatedData);
XLSX.utils.book_append_sheet(workbook, dataSheet, 'Valid Data');

// Add error log sheet if there are errors
if (errors.length > 0) {
  const errorSheet = XLSX.utils.json_to_sheet(errors);
  XLSX.utils.book_append_sheet(workbook, errorSheet, 'Validation Errors');
}

const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });

return [{
  json: {
    validRecords: validatedData.length,
    errorRecords: errors.length,
    successRate: ((validatedData.length / items.length) * 100).toFixed(2) + '%'
  },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `validated_data_${Date.now()}.xlsx`
    }
  }
}];
Scheduling Automated Excel Reports
Combine your scraping workflow with n8n's scheduling to create automated Excel reports:
Daily Report Configuration
// Schedule Trigger Node Configuration
{
  "trigger": "cron",
  "cronExpression": "0 8 * * *", // Daily at 8 AM
  "timezone": "America/New_York"
}
Common Schedule Patterns:
- Every hour: 0 * * * *
- Daily at 9 AM: 0 9 * * *
- Weekly on Monday at 8 AM: 0 8 * * 1
- Monthly on 1st at 6 AM: 0 6 1 * *
Workflow Example: Daily Automated Report
# Complete workflow structure:
Schedule Trigger (Daily 8 AM)
↓
HTTP Request (Scrape website)
↓
Code Node (Parse and clean data)
↓
Code Node (Generate Excel with formatting)
↓
Google Drive (Save to shared folder)
↓
Email (Send notification with link)
Best Practices for Excel Exports
- Use Descriptive Sheet Names - Make navigation easy with clear sheet names
- Include Metadata Sheet - Add scrape time, record count, and source information
- Apply Formatting Consistently - Use consistent colors, fonts, and number formats
- Add Data Validation - Include validation checks before export
- Implement Error Logging - Create separate sheets for validation errors
- Optimize Column Widths - Auto-fit or set appropriate widths for readability
- Add Freeze Panes - Freeze header rows for better navigation
- Include Formulas - Add computed columns for totals and averages (see the sketch after this list)
- Compress Large Files - Enable the library's compression option (or zip the output) for files over 10MB
- Version Control - Include timestamps in filenames for tracking
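For the "Include Formulas" point, here is a minimal sketch using the xlsx library; it assumes prices sit in column B, matching the product layout used throughout this guide:
// Code Node - append a SUM formula row (illustrative layout)
const XLSX = require('xlsx');

const worksheet = XLSX.utils.json_to_sheet(data); // "data" as in the earlier examples
const totalRow = data.length + 2; // 1 header row + data rows (1-indexed)
// Extend the sheet with a label; this also grows the sheet's !ref range
XLSX.utils.sheet_add_aoa(worksheet, [['Total']], { origin: `A${totalRow}` });
// Formula cells use the "f" property; Excel evaluates them when the file opens
worksheet[`B${totalRow}`] = { t: 'n', f: `SUM(B2:B${totalRow - 1})` };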
Troubleshooting Common Issues
Issue: Excel File Won't Open
Cause: Corrupted file or incorrect MIME type
Solution: Ensure you're using the correct MIME type and that the binary data is properly encoded:
mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
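Because .xlsx files are ZIP archives, a quick sanity check is to confirm the decoded bytes begin with the "PK" ZIP signature. A small illustrative snippet for a Code node (it assumes the binary property is named data, as in the examples above):
// Code Node - sanity-check the generated Excel binary
const b64 = $input.first().binary?.data?.data;
const bytes = Buffer.from(b64 || '', 'base64');
if (bytes[0] !== 0x50 || bytes[1] !== 0x4B) { // 'P', 'K'
  throw new Error('Binary property "data" does not look like a valid XLSX (ZIP) file');
}
return $input.all();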
Issue: Special Characters Display Incorrectly
Cause: Character-set mismatch introduced before the data reached the workbook
Solution: XLSX files store text as UTF-8 internally, so garbled characters almost always originate upstream. Decode the scraped page with the correct charset before building the workbook; the write call itself needs no encoding option:
const buffer = XLSX.write(workbook, {
  type: 'buffer',
  bookType: 'xlsx',
  compression: true // optional: smaller files; unrelated to text encoding
});
Issue: Formulas Not Calculating
Cause: Excel needs to recalculate on open
Solution: Force a full recalculation when the file opens. With exceljs this is a documented workbook property; the xlsx library writes formula strings without cached values, so Excel recalculates them on open by default:
workbook.calcProperties.fullCalcOnLoad = true; // exceljs
Issue: Memory Issues with Large Datasets
Cause: Loading entire dataset into memory at once
Solution: Implement batch processing and stream data when possible. Consider splitting into multiple files if necessary.
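A hedged sketch of the splitting approach: emit one n8n item per chunk so each becomes its own, smaller Excel file downstream (chunk size and file naming are illustrative):
// Code Node - split a large dataset into multiple Excel files
const XLSX = require('xlsx');
const CHUNK_SIZE = 50000; // tune to your memory budget

const rows = $input.all().map(item => item.json);
const out = [];
for (let i = 0; i < rows.length; i += CHUNK_SIZE) {
  const chunk = rows.slice(i, i + CHUNK_SIZE);
  const wb = XLSX.utils.book_new();
  XLSX.utils.book_append_sheet(wb, XLSX.utils.json_to_sheet(chunk), 'Data');
  const buf = XLSX.write(wb, { type: 'buffer', bookType: 'xlsx' });
  out.push({
    json: { part: out.length + 1, records: chunk.length },
    binary: {
      data: {
        data: buf.toString('base64'),
        mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
        fileName: `scraped_part_${out.length + 1}.xlsx`
      }
    }
  });
}
return out;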
Alternative: Export to Google Sheets
While Excel is powerful, exporting to Google Sheets offers real-time collaboration and automatic cloud backup. Consider Google Sheets when:
- Multiple users need simultaneous access
- You need automatic version history
- Cloud storage is preferred over file downloads
- You want to use Google Sheets formulas and add-ons
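As a rough illustration, an append-style Google Sheets node configuration might look like the following; the document ID is a placeholder and exact field names vary between n8n versions:
// Google Sheets Node Configuration (illustrative)
{
  "operation": "append",
  "documentId": "YOUR_SPREADSHEET_ID",
  "sheetName": "Scraped Data",
  "options": {
    "valueInputMode": "USER_ENTERED"
  }
}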
Using WebScraping.AI API for Reliable Data Collection
For production scraping workflows that handle complex authentication scenarios and anti-bot measures, consider using WebScraping.AI API:
// HTTP Request Node - WebScraping.AI API
{
  "method": "GET",
  "url": "https://api.webscraping.ai/html",
  "qs": {
    "api_key": "YOUR_API_KEY",
    "url": "https://example.com/products",
    "js": true,
    "proxy": "datacenter"
  }
}
Benefits include:
- Automatic JavaScript rendering
- Built-in proxy rotation
- CAPTCHA handling
- Browser fingerprinting protection
- Guaranteed uptime and reliability
Conclusion
Exporting scraped data to Excel from n8n provides powerful options for data analysis and reporting. The Spreadsheet File node offers simplicity for basic use cases, while custom Code nodes with libraries like xlsx or exceljs provide complete control over formatting, multiple sheets, and advanced Excel features.
Choose the method that best fits your needs:
- Spreadsheet File Node: Quick exports without custom formatting
- XLSX Library: Multiple sheets with basic formatting
- ExcelJS Library: Advanced formatting and styling
- Google Sheets: Cloud-based collaboration and sharing
Combine these Excel export capabilities with proper workflow automation and scheduled triggers to build professional data extraction pipelines that deliver actionable insights in a familiar, shareable format.