How can I extract Google Search related queries and suggestions?

Google Search provides several types of related queries and suggestions that can be valuable for SEO research, content strategy, and market analysis. These include autocomplete suggestions, "People also ask" questions, and related searches at the bottom of search results pages. This guide will show you how to extract these elements using various web scraping techniques.

Understanding Google's Suggestion Types

Google offers several types of related queries and suggestions:

  1. Autocomplete suggestions - Appear as you type in the search box
  2. People also ask - Expandable questions related to your search
  3. Related searches - Keywords shown at the bottom of search results
  4. Search refinements - Alternative query suggestions displayed alongside results

Method 1: Using Python with Requests and BeautifulSoup

Here's how to extract related searches from Google search results pages:

import requests
from bs4 import BeautifulSoup
import time
import random

def get_google_suggestions(query):
    """Extract related searches from Google search results"""

    # Set up headers to mimic a real browser
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.5',
        'Accept-Encoding': 'gzip, deflate',
        'Connection': 'keep-alive',
        'Upgrade-Insecure-Requests': '1'
    }

    # Construct search URL
    search_url = f"https://www.google.com/search?q={query.replace(' ', '+')}"

    try:
        # Add random delay to avoid rate limiting
        time.sleep(random.uniform(1, 3))

        response = requests.get(search_url, headers=headers, timeout=10)
        response.raise_for_status()

        soup = BeautifulSoup(response.content, 'html.parser')

        # Extract "People also ask" questions
        people_also_ask = []
        paa_elements = soup.find_all('div', {'class': lambda x: x and 'related-question-pair' in x})
        for element in paa_elements:
            question = element.find('span')
            if question:
                people_also_ask.append(question.get_text().strip())

        # Extract related searches (bottom of page)
        related_searches = []
        related_elements = soup.find_all('div', {'class': lambda x: x and 'BNeawe' in str(x)})
        for element in related_elements:
            text = element.get_text().strip()
            if text and len(text.split()) <= 6:  # Filter reasonable length suggestions
                related_searches.append(text)

        return {
            'query': query,
            'people_also_ask': people_also_ask[:5],  # Limit to first 5
            'related_searches': list(dict.fromkeys(related_searches))[:10]  # Remove duplicates, keep order, limit to 10
        }

    except requests.exceptions.RequestException as e:
        print(f"Error fetching data: {e}")
        return None

# Example usage
suggestions = get_google_suggestions("web scraping python")
if suggestions:
    print("People Also Ask:")
    for question in suggestions['people_also_ask']:
        print(f"- {question}")

    print("\nRelated Searches:")
    for search in suggestions['related_searches']:
        print(f"- {search}")

Method 2: Using Google Autocomplete API

Google provides an unofficial autocomplete API that you can use to get search suggestions:

import requests
import json

def get_autocomplete_suggestions(query, num_suggestions=10):
    """Get Google autocomplete suggestions using the suggestion API"""

    url = "https://suggestqueries.google.com/complete/search"
    params = {
        'client': 'firefox',
        'q': query,
        'hl': 'en'  # Language
    }

    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()

        # Parse JSON response
        suggestions_data = json.loads(response.text)
        suggestions = suggestions_data[1][:num_suggestions]

        return {
            'query': query,
            'autocomplete_suggestions': suggestions
        }

    except Exception as e:
        print(f"Error getting autocomplete suggestions: {e}")
        return None

# Example usage
autocomplete = get_autocomplete_suggestions("machine learning")
if autocomplete:
    print("Autocomplete Suggestions:")
    for suggestion in autocomplete['autocomplete_suggestions']:
        print(f"- {suggestion}")
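A common SEO trick ("alphabet soup") extends this method: append each letter of the alphabet to the seed query and collect autocomplete results for every variant, surfacing far more suggestions than a single call. The helper below only builds the expanded query list; wiring the variants into the get_autocomplete_suggestions function above is sketched in the trailing comment (every request there hits the network).

```python
import string

def expand_seed_query(query):
    """Build 'alphabet soup' variants of a seed query: the seed itself,
    then the seed followed by each letter a-z."""
    variants = [query]
    variants += [f"{query} {letter}" for letter in string.ascii_lowercase]
    return variants

variants = expand_seed_query("machine learning")
print(len(variants))  # 27 variants: the seed plus one per letter

# Collect suggestions for every variant (each call is a live request):
# all_suggestions = {}
# for v in variants:
#     result = get_autocomplete_suggestions(v)
#     if result:
#         all_suggestions[v] = result['autocomplete_suggestions']
```

Remember to apply the rate-limiting practices described later when looping over this many requests.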

Method 3: Using Puppeteer (JavaScript)

For dynamic content and more reliable extraction, use Puppeteer to render the page completely:

const puppeteer = require('puppeteer');

async function getGoogleSuggestions(query) {
    const browser = await puppeteer.launch({
        headless: true,
        args: ['--no-sandbox', '--disable-setuid-sandbox']
    });

    try {
        const page = await browser.newPage();

        // Set user agent to avoid detection
        await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36');

        // Navigate to Google search
        const searchUrl = `https://www.google.com/search?q=${encodeURIComponent(query)}`;
        await page.goto(searchUrl, { waitUntil: 'networkidle2' });

        // Extract "People also ask" questions
        const peopleAlsoAsk = await page.evaluate(() => {
            const questions = [];
            const paaElements = document.querySelectorAll('[data-initq]');

            paaElements.forEach(element => {
                const questionText = element.textContent?.trim();
                if (questionText) {
                    questions.push(questionText);
                }
            });

            return questions;
        });

        // Extract related searches
        const relatedSearches = await page.evaluate(() => {
            const searches = [];
            const relatedElements = document.querySelectorAll('a[data-ved] span');

            relatedElements.forEach(element => {
                const searchText = element.textContent?.trim();
                if (searchText && searchText.length > 3 && searchText.length < 100) {
                    searches.push(searchText);
                }
            });

            // Remove duplicates and return unique searches
            return [...new Set(searches)];
        });

        return {
            query,
            peopleAlsoAsk: peopleAlsoAsk.slice(0, 5),
            relatedSearches: relatedSearches.slice(0, 10)
        };

    } catch (error) {
        console.error('Error extracting suggestions:', error);
        return null;
    } finally {
        await browser.close();
    }
}

// Example usage
(async () => {
    const suggestions = await getGoogleSuggestions('web scraping tools');
    if (!suggestions) return;

    console.log('People Also Ask:');
    suggestions.peopleAlsoAsk.forEach(question => {
        console.log(`- ${question}`);
    });

    console.log('\nRelated Searches:');
    suggestions.relatedSearches.forEach(search => {
        console.log(`- ${search}`);
    });
})();

Method 4: Extracting Autocomplete Suggestions with Puppeteer

To capture real-time autocomplete suggestions as users type:

const puppeteer = require('puppeteer');

async function getAutocompleteSuggestions(query) {
    const browser = await puppeteer.launch({ headless: false }); // Set to true for production
    const page = await browser.newPage();

    try {
        await page.goto('https://www.google.com');

        // Wait for search box and focus on it
        await page.waitForSelector('input[name="q"]');
        const searchBox = await page.$('input[name="q"]');

        // Type the query character by character to trigger autocomplete
        await searchBox.type(query, { delay: 100 });

        // Wait for suggestions to appear (the .wM6W7d class name changes
        // frequently; verify it against the live page before relying on it)
        await page.waitForSelector('.wM6W7d', { timeout: 5000 });

        // Extract autocomplete suggestions
        const suggestions = await page.evaluate(() => {
            const suggestionElements = document.querySelectorAll('.wM6W7d span');
            return Array.from(suggestionElements).map(el => el.textContent.trim());
        });

        return suggestions.filter(s => s.length > 0);

    } catch (error) {
        console.error('Error getting autocomplete:', error);
        return [];
    } finally {
        await browser.close();
    }
}

Best Practices and Considerations

1. Rate Limiting and Respectful Scraping

Always implement proper rate limiting to avoid being blocked:

import time
import random

def respectful_delay():
    """Add random delay between requests"""
    delay = random.uniform(2, 5)  # 2-5 seconds
    time.sleep(delay)

# Use between requests
respectful_delay()
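The same idea can be packaged as a reusable decorator, so every call to a fetch function automatically waits out a randomized minimum interval since the previous call (the interval values and the fetch function below are illustrative):

```python
import time
import random
import functools

def throttled(min_delay, max_delay):
    """Decorator: enforce a random minimum gap before each call."""
    def decorator(func):
        last_call = [0.0]  # mutable closure cell holding the last call time

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = random.uniform(min_delay, max_delay)
            elapsed = time.monotonic() - last_call[0]
            if elapsed < wait:
                time.sleep(wait - elapsed)
            last_call[0] = time.monotonic()
            return func(*args, **kwargs)
        return wrapper
    return decorator

@throttled(2, 5)
def fetch(url):
    ...  # e.g. requests.get(url, headers=headers, timeout=10)
```

Because the gap is measured from the previous call rather than added unconditionally, back-to-back calls are throttled while naturally spaced calls proceed without extra waiting.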

2. Rotating User Agents and Headers

Use different user agents to appear more like organic traffic:

USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
]

headers = {
    'User-Agent': random.choice(USER_AGENTS),
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate',
    'Connection': 'keep-alive'
}

3. Error Handling and Resilience

Implement robust error handling for production use:

def safe_extract_suggestions(query, max_retries=3):
    """Extract suggestions with retry logic"""
    for attempt in range(max_retries):
        try:
            result = get_google_suggestions(query)
            if result:
                return result
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff

    return None

Advanced Techniques

Extracting Search Refinements

Google often shows search refinements and filters. Here's how to extract them:

def extract_search_refinements(soup):
    """Extract search refinement suggestions"""
    refinements = []

    # Look for refinement chips or buttons
    refinement_elements = soup.find_all('div', {'class': lambda x: x and 'refinement' in str(x).lower()})

    for element in refinement_elements:
        refinement_text = element.get_text().strip()
        if refinement_text:
            refinements.append(refinement_text)

    return refinements

Handling JavaScript-Heavy Content

For pages with dynamic content, consider using browser automation techniques with Puppeteer to ensure all suggestions load properly. You may also need to handle AJAX requests that populate suggestion data.
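One shortcut is to skip rendering entirely and call the underlying AJAX endpoint the page itself uses. The autocomplete endpoint from Method 2, for example, responds with a JSON array whose second element is the suggestion list. A defensive parser for that shape (the payload below is a hand-written sample mimicking the endpoint's format, not a live response):

```python
import json

def parse_autocomplete_payload(raw):
    """Parse the JSON array returned by Google's autocomplete endpoint.

    The response is shaped like ["query", ["suggestion1", ...], ...];
    return the suggestion list, or [] if the shape is unexpected.
    """
    try:
        data = json.loads(raw)
        if isinstance(data, list) and len(data) > 1 and isinstance(data[1], list):
            return [s for s in data[1] if isinstance(s, str)]
    except (json.JSONDecodeError, TypeError):
        pass
    return []

# Hand-written sample payload mimicking the endpoint's shape:
sample = '["python", ["python tutorial", "python download"], [], {}]'
print(parse_autocomplete_payload(sample))
```

Validating the shape before indexing protects you when Google changes the response format, which happens without notice.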

Legal and Ethical Considerations

When scraping Google search suggestions:

  1. Respect robots.txt - Check Google's robots.txt file
  2. Use reasonable request rates - Don't overwhelm Google's servers
  3. Prefer official APIs - For commercial use, Google's Custom Search JSON API is the supported route
  4. Review terms of service - Ensure compliance with Google's terms
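If you go the official route, the Custom Search JSON API takes an API key, a Programmable Search Engine ID (cx), and a query. The sketch below builds the documented request URL using only the standard library; the key and cx placeholders are yours to supply, and you can fetch the URL with requests or urllib:

```python
from urllib.parse import urlencode

def build_custom_search_url(query, api_key, cx, num=10):
    """Build a request URL for the Google Custom Search JSON API.

    key and cx come from the Google Cloud console and the Programmable
    Search Engine dashboard; the API caps num at 10 results per request.
    """
    params = {
        'key': api_key,
        'cx': cx,
        'q': query,
        'num': min(num, 10),
    }
    return 'https://www.googleapis.com/customsearch/v1?' + urlencode(params)

# Usage (requires valid credentials):
# import requests
# response = requests.get(build_custom_search_url("web scraping python",
#                                                 "YOUR_API_KEY", "YOUR_CX"),
#                         timeout=10)
# for item in response.json().get("items", []):
#     print(item["title"], item["link"])
```

The API is quota-limited (100 free queries per day at the time of writing), so it suits targeted research rather than bulk extraction.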

Troubleshooting Common Issues

Issue 1: Empty Results

  • Verify CSS selectors are current (Google frequently updates their HTML structure)
  • Check if JavaScript is required to load content
  • Ensure proper headers are set

Issue 2: Getting Blocked

  • Implement longer delays between requests
  • Use proxy rotation
  • Vary request patterns and headers
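Proxy rotation with requests can be as simple as cycling through a pool; each call hands the next proxy to the proxies argument. The proxy addresses below are placeholders for your own pool:

```python
import itertools

# Placeholder proxy addresses: replace with your own pool
PROXY_POOL = [
    'http://proxy1.example.com:8080',
    'http://proxy2.example.com:8080',
    'http://proxy3.example.com:8080',
]

proxy_cycle = itertools.cycle(PROXY_POOL)

def next_proxies():
    """Return a requests-style proxies dict using the next proxy in the pool."""
    proxy = next(proxy_cycle)
    return {'http': proxy, 'https': proxy}

# Usage with requests (not executed here):
# response = requests.get(search_url, headers=headers,
#                         proxies=next_proxies(), timeout=10)
```

itertools.cycle loops over the pool indefinitely, so every request gets a different exit IP in round-robin order.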

Issue 3: Inconsistent Data

  • Google personalizes results based on location and search history
  • Use incognito mode or clear cookies between requests
  • Consider using VPN services for consistent geographic results
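You can also pin geography and language at the URL level. The gl (country) and hl (language) parameters are documented; pws=0 has historically disabled personalized results, though Google does not guarantee this behavior. A small URL builder reflecting those assumptions:

```python
from urllib.parse import urlencode

def build_search_url(query, country='us', language='en'):
    """Build a Google search URL pinned to a country and language.

    gl/hl pin geography and language; pws=0 has historically disabled
    personalized results (unofficial, may change without notice).
    """
    params = {'q': query, 'gl': country, 'hl': language, 'pws': '0'}
    return 'https://www.google.com/search?' + urlencode(params)

print(build_search_url("web scraping python"))
```

Fixing these parameters makes repeated runs comparable even when requests are issued from different machines.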

Conclusion

Extracting Google search suggestions and related queries can provide valuable insights for SEO and content strategy. The methods shown above range from simple HTTP requests to sophisticated browser automation. Choose the approach that best fits your needs while respecting Google's terms of service and implementing proper rate limiting.

For complex scenarios requiring browser session management or handling dynamic content, Puppeteer provides the most reliable solution, though it requires more resources than simple HTTP requests.

Remember to always test your scraping code thoroughly and monitor for changes in Google's page structure, as these can break your extraction logic without warning.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl -G "https://api.webscraping.ai/ai/question" \
  --data-urlencode "url=https://example.com" \
  --data-urlencode "question=What is the main topic?" \
  --data-urlencode "api_key=YOUR_API_KEY"

Extract structured data:

curl -G "https://api.webscraping.ai/ai/fields" \
  --data-urlencode "url=https://example.com" \
  --data-urlencode "fields[title]=Page title" \
  --data-urlencode "fields[price]=Product price" \
  --data-urlencode "api_key=YOUR_API_KEY"
