
What is the Best Way to Export Scraped Data to Excel from n8n?

Exporting scraped data to Excel from n8n workflows is a common requirement for data analysis, reporting, and business intelligence. Unlike CSV exports, Excel files (.xlsx) support advanced features like multiple sheets, cell formatting, formulas, charts, and data validation. This comprehensive guide covers multiple approaches to export your scraped data to Excel format, from simple built-in solutions to advanced custom formatting techniques.

Why Choose Excel Over CSV?

Before diving into implementation, it's worth understanding when Excel exports are preferable:

  • Multiple worksheets - Organize different data types in separate sheets
  • Rich formatting - Apply colors, fonts, borders, and number formats
  • Formulas and calculations - Add computed columns and totals
  • Charts and graphs - Include visualizations alongside data
  • Data validation - Define dropdown lists and input constraints
  • Freeze panes and filters - Improve data navigation
  • Cell protection - Lock specific cells or sheets

Method 1: Using the Spreadsheet File Node (Recommended)

The Spreadsheet File node in n8n is the simplest and most reliable way to create Excel files from scraped data. This built-in node handles all the complexity of the XLSX format automatically.

Basic Excel Export Workflow

A typical n8n workflow for exporting scraped data to Excel follows this structure:

  1. Data Source - Webhook, Schedule, or Manual trigger
  2. Web Scraping - HTTP Request, HTML Extract, or Code node
  3. Data Processing - Transform and clean data (a minimal cleaning sketch follows this list)
  4. Spreadsheet File - Convert to Excel format
  5. Storage/Delivery - Save to cloud storage or email
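
For step 3, a minimal cleaning pass in a Code node might look like the sketch below. The field names (product_name, price, rating) match the sample data shown later in this guide, and the price handling assumes values scraped as strings such as "$1,299.00".

// n8n Code Node - normalize scraped fields before the Excel export step
const items = $input.all();

return items.map(item => {
  const data = item.json;
  return {
    json: {
      product_name: (data.product_name || '').trim(),
      // strip currency symbols and thousands separators before converting to a number
      price: Number(String(data.price ?? '').replace(/[^0-9.\-]/g, '')) || null,
      rating: data.rating != null ? Number(data.rating) : null,
      availability: data.availability || 'Unknown',
      url: data.url || ''
    }
  };
});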

Configuration Example

Here's how to configure the Spreadsheet File node for Excel export:

Node Settings:

  • Operation: Write to File
  • File Format: XLSX (Excel)
  • Options:
    • Sheet Name: Scraped Data
    • Include Headers: ✓
    • Auto Width: ✓

// Sample scraped data structure
[
  {
    "product_name": "Wireless Mouse",
    "price": 24.99,
    "rating": 4.5,
    "reviews_count": 1250,
    "availability": "In Stock",
    "url": "https://example.com/mouse"
  },
  {
    "product_name": "Mechanical Keyboard",
    "price": 89.99,
    "rating": 4.8,
    "reviews_count": 3420,
    "availability": "In Stock",
    "url": "https://example.com/keyboard"
  }
]

The Spreadsheet File node automatically converts this JSON structure into a properly formatted Excel file with headers and data rows.

Method 2: Advanced Excel Generation with Code Node

For complete control over Excel formatting, use the Code node with the popular xlsx (SheetJS) library. This approach enables custom styling, multiple sheets, formulas, and advanced Excel features. On self-hosted n8n instances, external packages such as xlsx must be installed and allow-listed (via the NODE_FUNCTION_ALLOW_EXTERNAL environment variable) before they can be required inside a Code node.

JavaScript Implementation with xlsx Library

// n8n Code Node - JavaScript
const XLSX = require('xlsx');

// Get scraped data from previous node
const items = $input.all();
const data = items.map(item => item.json);

// Create a new workbook
const workbook = XLSX.utils.book_new();

// Convert data to worksheet
const worksheet = XLSX.utils.json_to_sheet(data);

// Set column widths
worksheet['!cols'] = [
  { wch: 30 },  // Product Name
  { wch: 12 },  // Price
  { wch: 10 },  // Rating
  { wch: 15 },  // Reviews Count
  { wch: 15 },  // Availability
  { wch: 50 }   // URL
];

// Add worksheet to workbook
XLSX.utils.book_append_sheet(workbook, worksheet, 'Products');

// Create metadata sheet
const metadataSheet = XLSX.utils.aoa_to_sheet([
  ['Scrape Date', new Date().toISOString()],
  ['Total Records', data.length],
  ['Source', 'n8n Web Scraper'],
  ['Workflow', $workflow.name]
]);

XLSX.utils.book_append_sheet(workbook, metadataSheet, 'Metadata');

// Write to buffer
const excelBuffer = XLSX.write(workbook, {
  type: 'buffer',
  bookType: 'xlsx'
});

// Return as binary data
return [{
  json: {
    records: data.length,
    filename: `scraped_data_${Date.now()}.xlsx`
  },
  binary: {
    data: {
      data: excelBuffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `scraped_data_${Date.now()}.xlsx`
    }
  }
}];

Python Implementation with openpyxl

# n8n Code Node - Python
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment
import base64
from io import BytesIO
from datetime import datetime

# Get input data
items = _input.all()
data = [item['json'] for item in items]

# Create workbook
wb = Workbook()
ws = wb.active
ws.title = "Scraped Data"

# Add headers with formatting
headers = ['Product Name', 'Price', 'Rating', 'Reviews', 'Availability', 'URL']
header_fill = PatternFill(start_color='366092', end_color='366092', fill_type='solid')
header_font = Font(color='FFFFFF', bold=True)

for col, header in enumerate(headers, start=1):
    cell = ws.cell(row=1, column=col, value=header)
    cell.fill = header_fill
    cell.font = header_font
    cell.alignment = Alignment(horizontal='center')

# Add data rows
for row_idx, item in enumerate(data, start=2):
    ws.cell(row=row_idx, column=1, value=item.get('product_name'))
    ws.cell(row=row_idx, column=2, value=item.get('price'))
    ws.cell(row=row_idx, column=3, value=item.get('rating'))
    ws.cell(row=row_idx, column=4, value=item.get('reviews_count'))
    ws.cell(row=row_idx, column=5, value=item.get('availability'))
    ws.cell(row=row_idx, column=6, value=item.get('url'))

# Set column widths
ws.column_dimensions['A'].width = 30
ws.column_dimensions['B'].width = 12
ws.column_dimensions['C'].width = 10
ws.column_dimensions['D'].width = 15
ws.column_dimensions['E'].width = 15
ws.column_dimensions['F'].width = 50

# Freeze top row
ws.freeze_panes = 'A2'

# Save to buffer
buffer = BytesIO()
wb.save(buffer)
excel_data = buffer.getvalue()
excel_base64 = base64.b64encode(excel_data).decode()

# Return binary data
return [{
    'json': {
        'records': len(data),
        'filename': f'scraped_data_{int(datetime.now().timestamp())}.xlsx'
    },
    'binary': {
        'data': {
            'data': excel_base64,
            'mimeType': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
            'fileName': f'scraped_data_{int(datetime.now().timestamp())}.xlsx'
        }
    }
}]

Method 3: Creating Multi-Sheet Excel Files

One of Excel's most powerful features is the ability to organize related data across multiple worksheets. This is particularly useful when scraping different data types or categories.

Multi-Sheet Example

// Code Node - Multiple sheets for different categories
const XLSX = require('xlsx');
const items = $input.all();

// Organize data by category
const categories = {};
items.forEach(item => {
  const category = item.json.category || 'Uncategorized';
  if (!categories[category]) {
    categories[category] = [];
  }
  categories[category].push(item.json);
});

// Create workbook
const workbook = XLSX.utils.book_new();

// Add a sheet for each category
Object.keys(categories).forEach(category => {
  const worksheet = XLSX.utils.json_to_sheet(categories[category]);

  // Auto-fit columns
  const cols = Object.keys(categories[category][0]).map(() => ({ wch: 15 }));
  worksheet['!cols'] = cols;

  XLSX.utils.book_append_sheet(workbook, worksheet, category);
});

// Add summary sheet
const summaryData = Object.keys(categories).map(cat => ({
  Category: cat,
  'Item Count': categories[cat].length,
  'Avg Price': (categories[cat].reduce((sum, item) =>
    sum + (parseFloat(item.price) || 0), 0) / categories[cat].length).toFixed(2)
}));

const summarySheet = XLSX.utils.json_to_sheet(summaryData);
XLSX.utils.book_append_sheet(workbook, summarySheet, 'Summary', true);

// Export to buffer
const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });

return [{
  json: { sheets: Object.keys(categories).length + 1 },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `categorized_data_${Date.now()}.xlsx`
    }
  }
}];

Method 4: Advanced Formatting and Styling

When presenting data to stakeholders, professional formatting makes Excel exports more readable and actionable.

Adding Cell Formatting

// Advanced formatting with xlsx-style or exceljs
const ExcelJS = require('exceljs');

const workbook = new ExcelJS.Workbook();
const worksheet = workbook.addWorksheet('Scraped Data');

// Define columns with formatting
worksheet.columns = [
  { header: 'Product Name', key: 'product_name', width: 30 },
  { header: 'Price', key: 'price', width: 12, style: { numFmt: '$#,##0.00' } },
  { header: 'Rating', key: 'rating', width: 10, style: { numFmt: '0.0' } },
  { header: 'Reviews', key: 'reviews_count', width: 15, style: { numFmt: '#,##0' } },
  { header: 'Stock Status', key: 'availability', width: 15 }
];

// Style header row
worksheet.getRow(1).font = { bold: true, color: { argb: 'FFFFFFFF' } };
worksheet.getRow(1).fill = {
  type: 'pattern',
  pattern: 'solid',
  fgColor: { argb: 'FF366092' }
};
worksheet.getRow(1).alignment = { vertical: 'middle', horizontal: 'center' };

// Add data
const items = $input.all();
items.forEach(item => {
  const row = worksheet.addRow(item.json);

  // Conditional formatting: highlight low stock
  if (item.json.availability !== 'In Stock') {
    row.getCell('availability').fill = {
      type: 'pattern',
      pattern: 'solid',
      fgColor: { argb: 'FFFFEB9C' }
    };
  }

  // Color-code ratings
  const ratingCell = row.getCell('rating');
  if (item.json.rating >= 4.5) {
    ratingCell.font = { color: { argb: 'FF00B050' } };
  } else if (item.json.rating < 3.0) {
    ratingCell.font = { color: { argb: 'FFFF0000' } };
  }
});

// Add filters
worksheet.autoFilter = {
  from: 'A1',
  to: `E${items.length + 1}`
};

// Write to buffer
const buffer = await workbook.xlsx.writeBuffer();

return [{
  json: { formatted: true },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `formatted_data_${Date.now()}.xlsx`
    }
  }
}];

Method 5: Saving Excel Files to Various Destinations

After creating your Excel file, you need to save or deliver it. n8n offers multiple storage and delivery options.

Save to Google Drive

// After Spreadsheet File node, add Google Drive node
// Google Drive Node Configuration:
{
  "operation": "upload",
  "name": "scraped_data_{{ $now.format('yyyy-MM-dd_HH-mm') }}.xlsx",
  "parents": {
    "folderId": "YOUR_FOLDER_ID"
  },
  "binaryData": true
}

Send via Email with Attachment

// Email Node Configuration
{
  "operation": "send",
  "fromEmail": "scraper@yourdomain.com",
  "toEmail": "team@yourdomain.com",
  "subject": "Daily Scraping Report - {{ $now.format('yyyy-MM-dd') }}",
  "text": "Attached is today's scraped data report.",
  "options": {
    "attachments": "data"  // name of the binary property produced by the Excel step
  }
}

Upload to Dropbox

  1. Add Dropbox node after your Excel generation
  2. Set Operation to "Upload"
  3. Configure (a JSON view of these settings follows the list):
    • Path: /Scraped Data/report_{{ $now.format('yyyy-MM-dd') }}.xlsx
    • Binary Data: ✓
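
The same settings expressed in JSON form (illustrative, following the pattern of the other storage node configurations in this guide):

// Dropbox Node Configuration (illustrative)
{
  "operation": "upload",
  "path": "/Scraped Data/report_{{ $now.format('yyyy-MM-dd') }}.xlsx",
  "binaryData": true,
  "binaryPropertyName": "data"
}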

Upload to AWS S3

// AWS S3 Node Configuration
{
  "operation": "upload",
  "bucketName": "your-bucket-name",
  "fileName": "scraped-data/{{ $now.format('yyyy/MM/dd') }}/data.xlsx",
  "binaryData": true,
  "additionalFields": {
    "contentType": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
  }
}

Method 6: Handling Large Datasets

When scraping large amounts of data, consider memory constraints and processing time.

Streaming Large Datasets

// Code Node - Batch processing for large datasets
const XLSX = require('xlsx');
const BATCH_SIZE = 5000;
const items = $input.all();

// Create workbook
const workbook = XLSX.utils.book_new();

// Process in batches
for (let i = 0; i < items.length; i += BATCH_SIZE) {
  const batch = items.slice(i, i + BATCH_SIZE);
  const batchData = batch.map(item => item.json);

  const sheetName = `Data ${Math.floor(i / BATCH_SIZE) + 1}`;
  const worksheet = XLSX.utils.json_to_sheet(batchData);

  XLSX.utils.book_append_sheet(workbook, worksheet, sheetName);
}

// Write to buffer
const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });

return [{
  json: {
    totalRecords: items.length,
    sheets: Math.ceil(items.length / BATCH_SIZE)
  },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `large_dataset_${Date.now()}.xlsx`
    }
  }
}];

Scraping Dynamic Content for Excel Export

When scraping JavaScript-heavy websites, you may need browser automation. Running Puppeteer inside a Code node (with the puppeteer package installed and allow-listed, as noted earlier) can be combined with Excel export in a single step:

// Code Node - Puppeteer scraping + Excel export
const puppeteer = require('puppeteer');
const XLSX = require('xlsx');

async function scrapeAndExport() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Navigate and wait for content
  await page.goto('https://example.com/products', { waitUntil: 'networkidle2' });

  // Extract data
  const products = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('.product')).map(el => ({
      name: el.querySelector('.name')?.textContent.trim(),
      price: el.querySelector('.price')?.textContent.trim(),
      rating: el.querySelector('.rating')?.textContent.trim()
    }));
  });

  await browser.close();

  // Create Excel file
  const workbook = XLSX.utils.book_new();
  const worksheet = XLSX.utils.json_to_sheet(products);
  XLSX.utils.book_append_sheet(workbook, worksheet, 'Products');

  const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });

  return buffer;
}

const excelBuffer = await scrapeAndExport();

return [{
  json: {},
  binary: {
    data: {
      data: excelBuffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `scraped_products_${Date.now()}.xlsx`
    }
  }
}];

Data Validation and Quality Control

Before exporting to Excel, implement data validation to ensure quality:

// Code Node - Data validation before Excel export
const items = $input.all();
const validatedData = [];
const errors = [];

items.forEach((item, index) => {
  const data = item.json;
  const rowErrors = [];

  // Validate required fields
  if (!data.product_name || data.product_name.trim() === '') {
    rowErrors.push('Missing product name');
  }

  // Validate price format
  if (data.price && isNaN(parseFloat(data.price))) {
    rowErrors.push('Invalid price format');
  }

  // Validate rating range
  if (data.rating && (data.rating < 0 || data.rating > 5)) {
    rowErrors.push('Rating out of range');
  }

  // Validate URL format
  if (data.url && !data.url.match(/^https?:\/\/.+/)) {
    rowErrors.push('Invalid URL format');
  }

  if (rowErrors.length > 0) {
    errors.push({
      row: index + 1,
      errors: rowErrors.join('; '),
      data: data
    });
  } else {
    validatedData.push({
      ...data,
      validated_at: new Date().toISOString()
    });
  }
});

// Create Excel with both valid data and error log
const XLSX = require('xlsx');
const workbook = XLSX.utils.book_new();

// Add valid data sheet
const dataSheet = XLSX.utils.json_to_sheet(validatedData);
XLSX.utils.book_append_sheet(workbook, dataSheet, 'Valid Data');

// Add error log sheet if there are errors
if (errors.length > 0) {
  const errorSheet = XLSX.utils.json_to_sheet(errors);
  XLSX.utils.book_append_sheet(workbook, errorSheet, 'Validation Errors');
}

const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx' });

return [{
  json: {
    validRecords: validatedData.length,
    errorRecords: errors.length,
    successRate: ((validatedData.length / items.length) * 100).toFixed(2) + '%'
  },
  binary: {
    data: {
      data: buffer.toString('base64'),
      mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      fileName: `validated_data_${Date.now()}.xlsx`
    }
  }
}];

Scheduling Automated Excel Reports

Combine your scraping workflow with n8n's scheduling to create automated Excel reports:

Daily Report Configuration

// Schedule Trigger Node Configuration
{
  "trigger": "cron",
  "cronExpression": "0 8 * * *",  // Daily at 8 AM
  "timezone": "America/New_York"
}

Common Schedule Patterns:

  • Every hour: 0 * * * *
  • Daily at 9 AM: 0 9 * * *
  • Weekly on Monday at 8 AM: 0 8 * * 1
  • Monthly on the 1st at 6 AM: 0 6 1 * *

Workflow Example: Daily Automated Report

# Complete workflow structure:
Schedule Trigger (Daily 8 AM)
  ↓
HTTP Request (Scrape website)
  ↓
Code Node (Parse and clean data)
  ↓
Code Node (Generate Excel with formatting)
  ↓
Google Drive (Save to shared folder)
  ↓
Email (Send notification with link)

Best Practices for Excel Exports

  1. Use Descriptive Sheet Names - Make navigation easy with clear sheet names
  2. Include Metadata Sheet - Add scrape time, record count, and source information
  3. Apply Formatting Consistently - Use consistent colors, fonts, and number formats
  4. Add Data Validation - Include validation checks before export
  5. Implement Error Logging - Create separate sheets for validation errors
  6. Optimize Column Widths - Auto-fit or set appropriate widths for readability
  7. Add Freeze Panes - Freeze header rows for better navigation
  8. Include Formulas - Add computed columns for totals and averages (see the formula sketch after this list)
  9. Compress Large Files - Use ZIP compression for files over 10MB
  10. Version Control - Include timestamps in filenames for tracking
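
For practice 8, SheetJS supports formula cells through the f property. The sketch below appends a SUM row under the price column; it assumes price is the second column (B) and that data rows start at row 2, as in the earlier examples.

// n8n Code Node - append a SUM formula under the price column
// (assumes price is column B, with data rows starting at row 2)
const XLSX = require('xlsx');
const data = $input.all().map(item => item.json);

const worksheet = XLSX.utils.json_to_sheet(data);
const lastDataRow = data.length + 1;  // +1 accounts for the header row

// formula cells use the `f` property; Excel evaluates the formula when the file is opened
worksheet[`B${lastDataRow + 1}`] = { t: 'n', f: `SUM(B2:B${lastDataRow})` };

// extend the sheet range so the new cell is picked up
worksheet['!ref'] = XLSX.utils.encode_range({
  s: { r: 0, c: 0 },
  e: { r: lastDataRow, c: Object.keys(data[0]).length - 1 }
});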

Troubleshooting Common Issues

Issue: Excel File Won't Open

Cause: Corrupted file or incorrect MIME type

Solution: Ensure you're using the correct MIME type and that the binary data is properly encoded:

mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'

Issue: Special Characters Display Incorrectly

Cause: Text was decoded with the wrong character set before it reached the export step

Solution: XLSX files store strings as UTF-8 internally, so fix the encoding where the data enters the workflow (for example, make sure the HTTP Request node decodes the response as UTF-8). The write call itself needs no special encoding options:

// Strings passed to SheetJS are written as UTF-8 automatically
const buffer = XLSX.write(workbook, {
  type: 'buffer',
  bookType: 'xlsx',
  compression: true
});

Issue: Formulas Not Calculating

Cause: Excel needs to recalculate on open

Solution: Force a full recalculation when the file is opened. With ExcelJS this is a workbook-level property set before writing:

// ExcelJS: ask Excel to recalculate all formulas on open
workbook.calcProperties.fullCalcOnLoad = true;

Issue: Memory Issues with Large Datasets

Cause: Loading entire dataset into memory at once

Solution: Implement batch processing and stream data when possible. Consider splitting into multiple files if necessary.
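
A minimal sketch of the file-splitting approach: instead of building one large workbook, return one n8n item per chunk, each carrying its own smaller file. The chunk size and file naming here are illustrative.

// n8n Code Node - split a large dataset into several smaller workbooks
const XLSX = require('xlsx');
const CHUNK_SIZE = 20000;  // illustrative; tune to your memory limits
const rows = $input.all().map(item => item.json);

const output = [];
for (let i = 0; i < rows.length; i += CHUNK_SIZE) {
  const chunk = rows.slice(i, i + CHUNK_SIZE);
  const workbook = XLSX.utils.book_new();
  XLSX.utils.book_append_sheet(workbook, XLSX.utils.json_to_sheet(chunk), 'Data');
  const buffer = XLSX.write(workbook, { type: 'buffer', bookType: 'xlsx', compression: true });

  // each returned item becomes a separate downstream file
  output.push({
    json: { part: output.length + 1, records: chunk.length },
    binary: {
      data: {
        data: buffer.toString('base64'),
        mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
        fileName: `scraped_part_${output.length + 1}.xlsx`
      }
    }
  });
}

return output;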

Alternative: Export to Google Sheets

While Excel is powerful, exporting to Google Sheets offers real-time collaboration and automatic cloud backup (a sample node configuration is sketched after the list below). Consider Google Sheets when:

  • Multiple users need simultaneous access
  • You need automatic version history
  • Cloud storage is preferred over file downloads
  • You want to use Google Sheets formulas and add-ons
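
If you take this route, a minimal Google Sheets node configuration for appending scraped rows might look like the sketch below; the document ID is a placeholder, and exact parameter names can vary between node versions:

// Google Sheets Node Configuration (illustrative)
{
  "operation": "append",
  "documentId": "YOUR_SPREADSHEET_ID",
  "sheetName": "Scraped Data"
}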

Using WebScraping.AI API for Reliable Data Collection

For production scraping workflows that handle complex authentication scenarios and anti-bot measures, consider using WebScraping.AI API:

// HTTP Request Node - WebScraping.AI API
{
  "method": "GET",
  "url": "https://api.webscraping.ai/html",
  "qs": {
    "api_key": "YOUR_API_KEY",
    "url": "https://example.com/products",
    "js": true,
    "proxy": "datacenter"
  }
}

Benefits include:

  • Automatic JavaScript rendering
  • Built-in proxy rotation
  • CAPTCHA handling
  • Browser fingerprinting protection
  • Guaranteed uptime and reliability

Conclusion

Exporting scraped data to Excel from n8n provides powerful options for data analysis and reporting. The Spreadsheet File node offers simplicity for basic use cases, while custom Code nodes with libraries like xlsx or exceljs provide complete control over formatting, multiple sheets, and advanced Excel features.

Choose the method that best fits your needs:

  • Spreadsheet File Node: Quick exports without custom formatting
  • XLSX Library: Multiple sheets with basic formatting
  • ExcelJS Library: Advanced formatting and styling
  • Google Sheets: Cloud-based collaboration and sharing

Combine these Excel export capabilities with proper workflow automation and scheduled triggers to build professional data extraction pipelines that deliver actionable insights in a familiar, shareable format.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"


