Notice: while this tool is still available, we recommend 4CAT for Instagram scraping, as it offers more analysis options and is more actively maintained.
This tool interfaces with the API of Instagram to retrieve overviews of posts for a given set of usernames or hashtags.
Because large result sets take a long time to process, each query returns only a limited number of results. Furthermore, up to 5 queries may be scraped simultaneously; any further queries are ignored.
To avoid putting too much load on the DMI server and triggering Instagram's API limits, scrapes are placed in a queue and only one scrape may run at a time. Your place in the queue is shown when your scrape is first queued, and you can follow its progress in the output window.
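The queueing behaviour described above can be pictured with a minimal Python sketch. This is an illustration only, not the tool's actual code: the class `ScrapeQueue`, its method names, and the job identifiers are hypothetical, and it assumes that a scrape's place is simply its 1-based position among the waiting scrapes.

```python
from collections import deque


class ScrapeQueue:
    """Hypothetical first-in, first-out queue in which one scrape runs at a time."""

    def __init__(self):
        self.waiting = deque()  # scrapes that have not started yet
        self.running = None     # the single scrape currently running, if any

    def submit(self, job_id):
        """Queue a scrape and return its 1-based place among waiting scrapes."""
        self.waiting.append(job_id)
        return len(self.waiting)

    def start_next(self):
        """Start the oldest waiting scrape, but only if nothing is running."""
        if self.running is None and self.waiting:
            self.running = self.waiting.popleft()
        return self.running

    def finish(self):
        """Mark the running scrape as done so the next one can start."""
        self.running = None


# Hypothetical job names, used only for illustration
queue = ScrapeQueue()
first_place = queue.submit("scrape-a")
second_place = queue.submit("scrape-b")
active = queue.start_next()        # starts "scrape-a"
still_active = queue.start_next()  # "scrape-a" is still running, so nothing new starts
```

In this sketch a second scrape stays in the queue until `finish()` is called for the first one, which mirrors the one-scrape-at-a-time rule.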
If you close the window while your scrape is still active, or still in the queue, you can later look up its progress and results via the 'Past jobs' menu at the top of the page.
The following output is available:
- A CSV file containing metadata for all scraped posts.
- A JSON file containing metadata for all scraped posts.
- An HTML table containing metadata for all scraped posts.
- A GEXF graph, compatible with Gephi, with co-tag data. Co-tags are tags that appear together in the same post; the weight of the connection between two tags is the number of scraped posts in which they appear together.
- A GEXF graph, compatible with Gephi, with post/hashtag data. Nodes consist of posts and hashtags; connections are formed between a post and all hashtags appearing in it.
- (Only when querying usernames) A CSV file with metadata for all scraped users.
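
To make the two graph outputs more concrete, here is a minimal Python sketch of how co-tag weights and post/hashtag connections could be derived from a list of posts. It is not the tool's own implementation: the function names `cotag_weights` and `post_hashtag_edges`, the `(post_id, hashtags)` input shape, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cotag_weights(posts):
    """Count, for each pair of hashtags, the number of posts containing both."""
    weights = Counter()
    for _post_id, hashtags in posts:
        # Deduplicate and sort so each unordered pair has one canonical key
        unique_tags = sorted(set(hashtags))
        for pair in combinations(unique_tags, 2):
            weights[pair] += 1
    return weights


def post_hashtag_edges(posts):
    """Return one (post_id, hashtag) connection per hashtag appearing in a post."""
    edges = []
    for post_id, hashtags in posts:
        for tag in sorted(set(hashtags)):
            edges.append((post_id, tag))
    return edges


# Hypothetical sample data, not real scrape output
posts = [
    ("post1", ["cats", "pets", "cute"]),
    ("post2", ["cats", "pets"]),
    ("post3", ["dogs", "pets"]),
]

weights = cotag_weights(posts)
edges = post_hashtag_edges(posts)
```

With this sample data, `cats` and `pets` appear together in two posts, so their connection gets weight 2, while every other pair gets weight 1. The post/hashtag list contains one connection per hashtag in each post.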
The tool only scrapes posts. Stories, pinned stories, and IGTV content are not included in the scrape.