Scrapes an instagram user's photos and videos

Overview

Instagram Scraper

PyPI Build Status

instagram-scraper is a command-line application written in Python that scrapes and downloads an instagram user's photos and videos. Use responsibly.

Install

To install instagram-scraper:

$ pip install instagram-scraper

To update instagram-scraper:

$ pip install instagram-scraper --upgrade

Alternatively, you can clone the project and run the following command to install: Make sure you cd into the instagram-scraper-master folder before performing the command below.

$ python setup.py install

Usage

To scrape a user's media:

$ instagram-scraper <username> -u <your username> -p <your password>             

NOTE: To scrape a private user's media you must be an approved follower.

By default, downloaded media will be placed in <current working directory>/<username>.

Providing username and password is optional, if not supplied the scraper runs as a guest. Note: In this case all private user's media will be unavailable. All user's stories and high resolution profile pictures will also be unavailable.

To scrape a hashtag for media:

$ instagram-scraper <hashtag without #> --tag          

It may be useful to specify the --maximum <#> argument to limit the total number of items to scrape when scraping by hashtag.

To specify multiple users, pass a delimited list of users:

$ instagram-scraper username1,username2,username3           

You can also supply a file containing a list of usernames:

$ instagram-scraper -f ig_users.txt           
# ig_users.txt

username1
username2
username3

# and so on...

The usernames may be separated by newlines, commas, semicolons, or whitespace.

You can also supply a file containing a list of location ids:

$ instagram-scraper --tag <your_tag_here> --include-location --filter_location_file my_locations.txt           
# my_locations.txt
[some_reagion1]
location_id1
location_id2

[some_region2]
location_id3
location_id4

# and so on...

The resulting directory structure will be:

your_tag
├── some_reagion1
│   └── images_here
└── some_reagion2
    └── images_here

The locations can only be separated by newlines and spaces.

OPTIONS

--help -h               Show help message and exit.

--login-user  -u        Instagram login user.

--login-pass  -p        Instagram login password.

--followings-input      Use profiles followed by login-user as input

--followings-output     Output profiles from --followings-input to file

--filename    -f        Path to a file containing a list of users to scrape.

--destination -d        Specify the download destination. By default, media will 
                        be downloaded to <current working directory>/<username>.

--retain-username -n    Creates a username subdirectory when the destination flag is
                        set.

--media-types -t        Specify media types to scrape. Enter as space separated values. 
                        Valid values are image, video, story (story-image & story-video), broadcast
                        or none. Stories require a --login-user and --login-pass to be defined.
                      
--latest                Scrape only new media since the last scrape. Uses the last modified
                        time of the latest media item in the destination directory to compare.

--latest-stamps         Specify a file to save the timestamps of latest media scraped by user.
                        This works similarly to `--latest` except the file specified by
                        `--latest-stamps` will store the last modified time instead of using 
                        timestamps of media items in the destination directory. 
                        This allows the destination directories to be emptied whilst 
                        still maintaining history.

--cookiejar             File in which to store cookies so that they can be reused between runs.

--quiet       -q        Be quiet while scraping.

--maximum     -m        Maximum number of items to scrape.

--media-metadata        Saves the media metadata associated with the user's posts to 
                        <destination>/<username>.json. Can be combined with --media-types none
                        to only fetch the metadata without downloading the media.

--include-location      Includes location metadata when saving media metadata. 
                        Implicitly includes --media-metadata.

--profile-metadata      Saves the user profile metadata to  <destination>/<username>.json.

--proxies               Enable use of proxies, add a valid JSON with http or/and https urls.
                        Example: '{"http": "http://<ip>:<port>", "https": "https://<ip>:<port>" }'

--comments             Saves the comment metadata associated with the posts to 
                       <destination>/<username>.json. Implicitly includes --media-metadata.
                    
--interactive -i       Enables interactive login challenge solving. Has 2 modes: SMS and Email

--retry-forever        Retry download attempts endlessly when errors are received

--tag                   Scrapes the specified hashtag for media.

--filter                Scrapes the specified hashtag within a user's media.

--filter_location       Filter scrape queries by command line location(s) ids

--filter_location_file  Provide location ids by file to filter queries 

--location              Scrapes the specified instagram location-id for media.

--search-location       Search for a location by name. Useful for determining the location-id of 
                        a specific place.
                    
--template -T           Customize and format each file's name.
                        Default: {urlname}
                        Options:
                        {username}: Scraped user
                        {shortcode}: Post shortcode (profile_pic and story are empty)
                        {urlname}: Original file name from url.
                        {mediatype}: The type of media being downloaded.
                        {datetime}: Date and time of upload. (Format: 20180101 01h01m01s)
                        {date}: Date of upload. (Format: 20180101)
                        {year}: Year of upload. (Format: 2018)
                        {month}: Month of upload. (Format: 01-12)
                        {day}: Day of upload. (Format: 01-31)
                        {h}: Hour of upload. (Format: 00-23h)
                        {m}: Minute of upload. (Format: 00-59m)
                        {s}: Second of upload. (Format: 00-59s)

                        If the template is invalid, it will revert to the default.
                        Does not work with --tag and --location.

Develop

Clone the repo and create a virtualenv

$ virtualenv venv
$ source venv/bin/activate
$ python setup.py develop

Running Tests

$ python setup.py test

# or just 

$ nosetests

Contributing

  1. Check the open issues or open a new issue to start a discussion around your feature idea or the bug you found
  2. Fork the repository, make your changes, and add yourself to AUTHORS.md
  3. Send a pull request

License

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

You might also like...
Instagram-follower-bot - An Instagram follower bot written in Python

Instagram Follower Bot An Instagram follower bot written in Python. The bot follows the follower of which account you want. e.g. (You want to follow @

Instagram - Instagram Account Reporting Tool

Instagram Instagram Account Reporting Tool Installation On Termux $ apt update $

Upload-Instagram - Auto Uploading Instagram Bot

###Instagram Uploading Bot### Download Python and Chrome browser pip install -r

A Telegram bot to download posts, videos, reels, IGTV and a user profile picture from Instagram!
A Telegram bot to download posts, videos, reels, IGTV and a user profile picture from Instagram!

Telegram Bot A telegram bot to download media from Instagram! No API Key or Login Needed! Requirements You must have python installed (of course) You

This is a simple program that uses Python and pyTwitchAPI to retrieve the list of users in a streamer's chat and then checks each one of these users to see if they follow the broadcaster or not

This is a simple program that uses Python and pyTwitchAPI to retrieve the list of users in a streamer's chat and then checks each one of these users to see if they follow the broadcaster or not

This is Instagram reposter that repost TikTok videos.

from-tiktok-to-instagram-reposter This script reposts videos from Tik Tok to your Instagram account. You must enter the username and password and slee

Discord bot code to stop users that are scamming with fake messages of free discord nitro on servers in order to steal users accounts.
Discord bot code to stop users that are scamming with fake messages of free discord nitro on servers in order to steal users accounts.

AntiScam Discord bot code to stop users that are scamming with fake messages of free discord nitro on servers in order to steal users accounts. How to

A Bot Telegram Anti Users Channel to automatic ban users who using channel to send message in group.

Tg_Anti_UsersChannel A Bot Telegram Anti Users Channel to automatic ban users who using channel to send message in group. Features: Automatic ban Whit

Telegram bot to stream videos in telegram voicechat for both groups and channels. Supports live strams, YouTube videos and telegram media.

Telegram VCVideoPlayBot An Telegram Bot By @ZauteKm To Stream Videos in Telegram Voice Chat. NOTE: Make sure you have started a VoiceChat in your Grou

Releases(v1.11.0)
  • v1.11.0(Jun 17, 2022)

    What's Changed

    • Remove Python 2.7 supporting. by @AlexNik in https://github.com/arc298/instagram-scraper/pull/781
    • Add fileno method for class LockedStream by @fgremler in https://github.com/arc298/instagram-scraper/pull/782
    • Add explanation to KeyError: 'csrftoken' by @nns33213 in https://github.com/arc298/instagram-scraper/pull/788
    • Fix https://github.com/arc298/instagram-scraper/issues/805 by @nns33213 in https://github.com/arc298/instagram-scraper/pull/809

    New Contributors

    • @fgremler made their first contribution in https://github.com/arc298/instagram-scraper/pull/782

    Full Changelog: https://github.com/arc298/instagram-scraper/compare/v1.10.6...v1.11.0

    Source code(tar.gz)
    Source code(zip)
  • v1.10.6(Feb 24, 2022)

    What's Changed

    • Urgent: don't swallow WEBP posts by @nns33213 in https://github.com/arc298/instagram-scraper/pull/779

    Full Changelog: https://github.com/arc298/instagram-scraper/compare/v1.10.5...v1.10.6

    Source code(tar.gz)
    Source code(zip)
  • v1.10.5(Feb 14, 2022)

    What's Changed

    • Added way to collect stories metadata without downloading video/images by @akorchyn in https://github.com/arc298/instagram-scraper/pull/753
    • Fix file download, properly show if profile is private by @nns33213 in https://github.com/arc298/instagram-scraper/pull/762
    • Processing failed status response. Fixes #765, fixes #764 by @VasiliPupkin256 in https://github.com/arc298/instagram-scraper/pull/775
    • Fix --followings-output path by @Roffild in https://github.com/arc298/instagram-scraper/pull/718

    New Contributors

    • @akorchyn made their first contribution in https://github.com/arc298/instagram-scraper/pull/753

    Full Changelog: https://github.com/arc298/instagram-scraper/compare/1.10.4...v1.10.5

    Source code(tar.gz)
    Source code(zip)
  • v1.10.4(Jan 21, 2022)

  • v1.10.3(Nov 10, 2021)

  • v1.10.2(Aug 18, 2021)

  • v1.10.1(Aug 1, 2021)

    Changes

    • Docker build suport. #660
    • Extra info from account. #688
    • Exit when login is failed. #677
    • If JSON parse fails, parse as HTML. #512
    • Fall back to alternate content-length header #511
    Source code(tar.gz)
    Source code(zip)
  • v1.10.0(May 19, 2021)

  • v1.9.1(Oct 10, 2020)

    Changes

    • Get profile pictures if it exists. #523
    • Adds broadcast to values in --media-types flag. #531
    • Adds new shareddata format extraction. Fixes #578, #515. #580

    Bug Fixes

    • Fixes #525 bug that allows duplicate post metadata by comparing the ID of the posts before saving them to json. Ensure the posts we saved are newer than the timestamp provided by --latest-stamps arg. If this arg is not used then all posts collected are saved. #532
    • Fixes null reference error when getting location #568
    • Fix for video_resources #594. #595
    Source code(tar.gz)
    Source code(zip)
  • v1.9.0(Apr 12, 2020)

    New Features

    • Get live video / broadcast #501

    Changes

    • Change the file format when using filter_locations_file to an ini file format #504

    Bug Fixes

    • Fix bug related to search-location when 'lat' or 'lng' are not found #505
    • Handles missing has_anonymous_profile_picture key from response #506
    Source code(tar.gz)
    Source code(zip)
  • v1.8.1(Feb 7, 2020)

    Bug Fixes

    • Fixes include_location argument error #489
    • Fixes typo during argument checking #489
    • Workaround for Bad Gateway error #488
    • Force using IPv4 when the machine supports IPv6 networks #493
    Source code(tar.gz)
    Source code(zip)
  • 1.8.0(Feb 1, 2020)

    New Features

    • Adds location filtering for tag and location scrap queries #483
    • Adds fetching up highlight stories #481

    Changes

    • Get profile metadata without login #477
    • Update Instagram version in STORIES_UA #474

    Bug Fixes

    • Fixes help message for --proxies #479
    Source code(tar.gz)
    Source code(zip)
  • 1.7.1(Aug 15, 2019)

  • 1.7.0(Aug 10, 2019)

  • v1.6.1(May 18, 2019)

  • 1.6.0(Mar 31, 2019)

    New Features

    • Adds --profile-metadata argument to enable scraping for profile metadata #341
    • Add ability to use proxy while using instagram session #348

    Changes

    • Make logging in as guest default behavior if no username/pw specified #327
    • Changes error log to info log since we can get profile information when user is private #356

    Bug Fixes

    • Fixes bug when using --latest parameter.
    Source code(tar.gz)
    Source code(zip)
  • v1.5.41(Dec 31, 2018)

    Bug Fixes

    • Fixes 403 Forbidden for url #316
    • Adds command-line option to store cookies at end of scraping and reload them in the next use #324
    Source code(tar.gz)
    Source code(zip)
  • v1.5.40(Oct 10, 2018)

  • v1.5.39(Aug 27, 2018)

  • 1.5.38(Aug 14, 2018)

  • 1.5.37(Jul 29, 2018)

    Changes

    • Avoids creation of empty directory if there is nothing to download #256
    • Uses query parameters if static url is forbidden and range requests handling improvement #263
    • Updates user-agents to a more recent version #265
    • Fixes --template not being applied to profile picture #271
    Source code(tar.gz)
    Source code(zip)
  • 1.5.36(May 27, 2018)

  • 1.5.35(May 25, 2018)

  • 1.5.34(May 24, 2018)

    Bug Fixes

    • Fix Login Csrftoken #233 #231 #234
    • Don't redownload the latest item with --latest-stamps #224
    • Avoid re-downloading profile pic with --latest-stamps #223
    Source code(tar.gz)
    Source code(zip)
  • 1.5.32(Apr 19, 2018)

  • 1.5.31(Apr 14, 2018)

    Bug Fixes

    • Due to changes to Instagram's API, authentication by supplying a login user (-u) and password (-p) arguments is now mandatory, adds x-instagram-gis header for authentication #210, #212, #213
    Source code(tar.gz)
    Source code(zip)
  • 1.5.28(Apr 8, 2018)

  • 1.5.27(Apr 4, 2018)

  • 1.5.26(Mar 27, 2018)

  • 1.5.25(Mar 24, 2018)

Owner
I love making things, collaborating, and solving interesting problems. Let's build something awesome together.
A discord.py bot template with easy deployment through Github Actions

discord.py bot template A discord.py bot template with easy deployment through Github Actions. You can use this template to just run a Python instance

Thomas Van Iseghem 1 Feb 09, 2022
Discord Mass Edit is a unique, purging related Discord tool that differs from the regular mass delete.

Discord Mass Edit is a unique, purging related Discord tool that differs from the regular mass delete. This tool will automatically edit every message in a chosen channel and change it to a random st

c0mpt0 1 Jul 27, 2022
This is simple maker for level card in discord bot.

mariocard This is simple maker for level card in discord bot in discord.py or pycord. Installing Python 3.8 or higher is required # Linux/macOS pip3 i

3 Jan 29, 2022
DB-Drive-CSV - This is app is can be used to access CSV file as JSON from Google Drive.

DB Drive CSV This is app is can be used to access CSV file as JSON from Google Drive. How To Use Create file/ upload file to Google Drive There's 2 fi

Hartawan Bahari M. 5 Oct 20, 2022
Filters to block and remove copycat-websites from DuckDuckGo and Google

uBlock Origin - Shitty Copy-Paste websites filter Filter for uBlock origin to remove spam-website results from DuckDuckGo and Google that just blatant

99 Dec 15, 2022
An App to get Ko-Fi payment updates on Telegram.

Deployments. Heroku.com 🚀 Replit.com 🌀 Make sure your app runs 24*7 Zeet.co 💪 Use this :~ Get Bot token from @botfather 🤖 Get ID where you want to

Jainam Oswal 16 Nov 12, 2022
Wechat based auto reply with pyautogui

Python-微信 自动回复 练手~ 一直想做个给微信发个消息,就可以跑Python程序,并将结果发送给我的东西,之前看了 B站@不高兴就喝水 的视频,终于有了灵感~ 使用的是模拟点击方案,请求期间是不能操作了。 库 pyautogui 用于模拟鼠标键盘操作和定位操作位置 pyperclip 剪贴板

Vito Song 1 Oct 22, 2022
A Discord bot that controls Pico-8.

Pico-8 Discord Bot Synopsis: A Discord bot that controls Pico-8. Please let me know if you make any games with this tool! I will simplify the discord.

Camden 1 Jan 28, 2022
Python library for Spurwing API to schedule appointments, manage calendars and custom integrations.

Spurwing API Python Library Lightweight Python library for Spurwing's API. Spurwing's API makes it easy to add robust scheduling and booking to your a

Spurwing 1 Jul 14, 2021
A Discord bot that rewards players in Minecraft for sending messages on Discord

MCRewards-Discord-Bot A Discord bot that rewards players in Minecraft for sending messages on Discord How to setup: Download this git as a .zip, or cl

3 Dec 26, 2021
An Python SDK for QQ based on mirai-api-http v2.

Argon 一个基于 graia-broadcast 和 mirai-api-http v2 的 Python SDK。 本项目适用于 mirai-api-http 2.0 以上版本。 目前仍处于开发阶段,内部接口可能会有较大的变化。 The Stasis / 停滞 为维持 GraiaProject

BlueGlassBlock 1 Oct 29, 2021
Modified Version Of Media Search bot

Modified Version Of Media Search bot

1 Oct 09, 2021
Send Informative, Concise Slack Notifications With Minimal Effort

slack-templates Send Informative, Concise Slack Notifications With Minimal Effort slack-templates Slack Integration Available Templates Usage Report t

9 Nov 03, 2022
Ein PY-Skript, mit dem tiled-Editor-Maps bearbeitet werden

tilesetCopyrighter Ein PY-Skript, mit dem tiled-Editor-Maps bearbeitet werden können fügt je Tileset eine custom-Property tilesetCopyright (string) hi

1 Dec 26, 2021
Streaming Finance Data with AWS Lambda

A data pipeline consisting of an AWS lambda function reading data from yfinance API, an AWS Kinesis stream to receive & store data in S3 buckets and AWS Glue crawler & Athena to run SQL queries.

Aarif Munwar Jahan 4 Aug 30, 2022
A command line interface for accessing google drive

Drive Cli Get the ability to access Google Drive without leaving your terminal. Inspiration Google Drive has become a vital part of our day to day lif

Chirag Shetty 538 Dec 12, 2022
The Best Telegram UserBot Made With Pyrogram [Python]

Asterix UserBot A Powerful Telegram userbot based on Pyrogram. How To Deploy Asterix Heroku Railway Qovery Termux Tutorial Railway Deploy Comming Soon

TeamAsterix 9 Oct 17, 2022
a Disqus alternative

Isso – a commenting server similar to Disqus Isso – Ich schrei sonst – is a lightweight commenting server written in Python and JavaScript. It aims to

Martin Zimmermann 4.7k Jan 02, 2023
A simple worker for OpenClubhouse to sync data.

OpenClubhouse-Worker This is a simple worker for OpenClubhouse to sync CH channel data.

100 Dec 17, 2022
A python package for fetching informations from GitHub API

Py-GitHub A python package for fetching informations from GitHub API Made with Python3 (C) @FayasNoushad Copyright permission under MIT License Licens

Fayas Noushad 6 Nov 28, 2021