Instagram Scraping in 2021

The history and why it is difficult

Making an Instagram scraper used to be easy and straightforward. There was a powerful, easy-to-use API, and you could just load a URL and get all the data.
The URL method still works, but with a few caveats, explained below.

In recent years, Instagram has made many changes to its site to make scraping harder.

Here are some of those changes:

  • The old API was shut down. The new one is very restrictive and tied to the Facebook API.
  • Authentication is required to access the site from datacenter IPs.
  • Authentication is required after a few visits from residential IPs.

You can see a history of these changes by reading these StackOverflow questions and answers:

Working ways to do it

All of the current ways of accessing Instagram data revolve around the ?__a=1 query parameter and Instagram's internal GraphQL API.
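
As a rough illustration of the ?__a=1 approach, the sketch below appends the parameter to a page URL. The helper name withJsonParam and the example profile URL are assumptions made for illustration only, and the shape of the returned data is not shown because Instagram can change it at any time.

```javascript
// Hypothetical helper: append the __a=1 query parameter to an Instagram page URL,
// keeping any query parameters that are already present.
function withJsonParam(pageUrl) {
  const url = new URL(pageUrl);
  url.searchParams.set('__a', '1');
  return url.toString();
}

// Example profile URL (an assumption, not taken from the article).
console.log(withJsonParam('https://www.instagram.com/instagram/'));
```

Using the built-in URL class instead of string concatenation avoids having to check whether the page URL already contains a query string.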

Here are some open-source projects doing it:

Another way is to send a sessionid cookie with your requests, but this method violates Instagram's TOS and will get your account banned.

How to do it on WebScraping.AI

To scrape Instagram data, you need to use the proxy=residential parameter in your request. We rotate proxies on every request, so Instagram won't recognise your request as coming from a bot and won't require authentication. The only downside of residential proxies is the price: datacenter proxies are much cheaper.

An example of such a request:

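const request = require('request'); // peer dependency of request-promise
const requestPromise = require('request-promise');

const options = {
  method: 'GET',
  url: '', // WebScraping.AI API endpoint
  qs: {
    api_key: 'test-api-key',
    url: '', // Instagram page to scrape
    proxy: 'residential', // use residential proxies instead of datacenter ones
    timeout: 15000
  }
};

// In a CommonJS script, await is only valid inside an async function.
(async () => {
  const body = await requestPromise(options);
  console.log(body);
})();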
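
Since Instagram may still ask for authentication on some requests, it can help to check that the response body is really JSON before using it. This is a minimal sketch, assuming the target URL uses ?__a=1 and the response body arrives as a string; parseJsonOrNull is a hypothetical helper, not part of any library.

```javascript
// Hypothetical helper: return the parsed JSON body, or null when the body
// is not valid JSON (for example, an HTML login page).
function parseJsonOrNull(body) {
  try {
    return JSON.parse(body);
  } catch (err) {
    return null;
  }
}

// Sample body standing in for a login page (made up for this example).
const data = parseJsonOrNull('<html>Login</html>');
if (data === null) {
  console.log('Response is not JSON; the request may have hit a login page.');
}
```

Returning null instead of throwing lets the caller decide whether to retry the request or skip the page.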

WebScraping.AI provides rotating proxies, Chrome rendering and built-in HTML parser for web scraping
© 2021 WebScraping.AI | Privacy Policy and Terms of Service