Scrapes an instagram user's photos and videos

Overview

Instagram Scraper

PyPI Build Status

instagram-scraper is a command-line application written in Python that scrapes and downloads an instagram user's photos and videos. Use responsibly.

Install

To install instagram-scraper:

$ pip install instagram-scraper

To update instagram-scraper:

$ pip install instagram-scraper --upgrade

Alternatively, you can clone the project and run the following command to install: Make sure you cd into the instagram-scraper-master folder before performing the command below.

$ python setup.py install

Usage

To scrape a user's media:

$ instagram-scraper <username> -u <your username> -p <your password>             

NOTE: To scrape a private user's media you must be an approved follower.

By default, downloaded media will be placed in <current working directory>/<username>.

Providing username and password is optional, if not supplied the scraper runs as a guest. Note: In this case all private user's media will be unavailable. All user's stories and high resolution profile pictures will also be unavailable.

To scrape a hashtag for media:

$ instagram-scraper <hashtag without #> --tag          

It may be useful to specify the --maximum <#> argument to limit the total number of items to scrape when scraping by hashtag.

To specify multiple users, pass a delimited list of users:

$ instagram-scraper username1,username2,username3           

You can also supply a file containing a list of usernames:

$ instagram-scraper -f ig_users.txt           
# ig_users.txt

username1
username2
username3

# and so on...

The usernames may be separated by newlines, commas, semicolons, or whitespace.

You can also supply a file containing a list of location ids:

$ instagram-scraper --tag <your_tag_here> --include-location --filter_location_file my_locations.txt           
# my_locations.txt
[some_reagion1]
location_id1
location_id2

[some_region2]
location_id3
location_id4

# and so on...

The resulting directory structure will be:

your_tag
├── some_reagion1
│   └── images_here
└── some_reagion2
    └── images_here

The locations can only be separated by newlines and spaces.

OPTIONS

--help -h               Show help message and exit.

--login-user  -u        Instagram login user.

--login-pass  -p        Instagram login password.

--followings-input      Use profiles followed by login-user as input

--followings-output     Output profiles from --followings-input to file

--filename    -f        Path to a file containing a list of users to scrape.

--destination -d        Specify the download destination. By default, media will 
                        be downloaded to <current working directory>/<username>.

--retain-username -n    Creates a username subdirectory when the destination flag is
                        set.

--media-types -t        Specify media types to scrape. Enter as space separated values. 
                        Valid values are image, video, story (story-image & story-video), broadcast
                        or none. Stories require a --login-user and --login-pass to be defined.
                      
--latest                Scrape only new media since the last scrape. Uses the last modified
                        time of the latest media item in the destination directory to compare.

--latest-stamps         Specify a file to save the timestamps of latest media scraped by user.
                        This works similarly to `--latest` except the file specified by
                        `--latest-stamps` will store the last modified time instead of using 
                        timestamps of media items in the destination directory. 
                        This allows the destination directories to be emptied whilst 
                        still maintaining history.

--cookiejar             File in which to store cookies so that they can be reused between runs.

--quiet       -q        Be quiet while scraping.

--maximum     -m        Maximum number of items to scrape.

--media-metadata        Saves the media metadata associated with the user's posts to 
                        <destination>/<username>.json. Can be combined with --media-types none
                        to only fetch the metadata without downloading the media.

--include-location      Includes location metadata when saving media metadata. 
                        Implicitly includes --media-metadata.

--profile-metadata      Saves the user profile metadata to  <destination>/<username>.json.

--proxies               Enable use of proxies, add a valid JSON with http or/and https urls.
                        Example: '{"http": "http://<ip>:<port>", "https": "https://<ip>:<port>" }'

--comments             Saves the comment metadata associated with the posts to 
                       <destination>/<username>.json. Implicitly includes --media-metadata.
                    
--interactive -i       Enables interactive login challenge solving. Has 2 modes: SMS and Email

--retry-forever        Retry download attempts endlessly when errors are received

--tag                   Scrapes the specified hashtag for media.

--filter                Scrapes the specified hashtag within a user's media.

--filter_location       Filter scrape queries by command line location(s) ids

--filter_location_file  Provide location ids by file to filter queries 

--location              Scrapes the specified instagram location-id for media.

--search-location       Search for a location by name. Useful for determining the location-id of 
                        a specific place.
                    
--template -T           Customize and format each file's name.
                        Default: {urlname}
                        Options:
                        {username}: Scraped user
                        {shortcode}: Post shortcode (profile_pic and story are empty)
                        {urlname}: Original file name from url.
                        {mediatype}: The type of media being downloaded.
                        {datetime}: Date and time of upload. (Format: 20180101 01h01m01s)
                        {date}: Date of upload. (Format: 20180101)
                        {year}: Year of upload. (Format: 2018)
                        {month}: Month of upload. (Format: 01-12)
                        {day}: Day of upload. (Format: 01-31)
                        {h}: Hour of upload. (Format: 00-23h)
                        {m}: Minute of upload. (Format: 00-59m)
                        {s}: Second of upload. (Format: 00-59s)

                        If the template is invalid, it will revert to the default.
                        Does not work with --tag and --location.

Develop

Clone the repo and create a virtualenv

$ virtualenv venv
$ source venv/bin/activate
$ python setup.py develop

Running Tests

$ python setup.py test

# or just 

$ nosetests

Contributing

  1. Check the open issues or open a new issue to start a discussion around your feature idea or the bug you found
  2. Fork the repository, make your changes, and add yourself to AUTHORS.md
  3. Send a pull request

License

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

You might also like...
Instagram-follower-bot - An Instagram follower bot written in Python

Instagram Follower Bot An Instagram follower bot written in Python. The bot follows the follower of which account you want. e.g. (You want to follow @

Instagram - Instagram Account Reporting Tool

Instagram Instagram Account Reporting Tool Installation On Termux $ apt update $

Upload-Instagram - Auto Uploading Instagram Bot

###Instagram Uploading Bot### Download Python and Chrome browser pip install -r

A Telegram bot to download posts, videos, reels, IGTV and a user profile picture from Instagram!
A Telegram bot to download posts, videos, reels, IGTV and a user profile picture from Instagram!

Telegram Bot A telegram bot to download media from Instagram! No API Key or Login Needed! Requirements You must have python installed (of course) You

This is a simple program that uses Python and pyTwitchAPI to retrieve the list of users in a streamer's chat and then checks each one of these users to see if they follow the broadcaster or not

This is a simple program that uses Python and pyTwitchAPI to retrieve the list of users in a streamer's chat and then checks each one of these users to see if they follow the broadcaster or not

This is Instagram reposter that repost TikTok videos.

from-tiktok-to-instagram-reposter This script reposts videos from Tik Tok to your Instagram account. You must enter the username and password and slee

Discord bot code to stop users that are scamming with fake messages of free discord nitro on servers in order to steal users accounts.
Discord bot code to stop users that are scamming with fake messages of free discord nitro on servers in order to steal users accounts.

AntiScam Discord bot code to stop users that are scamming with fake messages of free discord nitro on servers in order to steal users accounts. How to

A Bot Telegram Anti Users Channel to automatic ban users who using channel to send message in group.

Tg_Anti_UsersChannel A Bot Telegram Anti Users Channel to automatic ban users who using channel to send message in group. Features: Automatic ban Whit

Telegram bot to stream videos in telegram voicechat for both groups and channels. Supports live strams, YouTube videos and telegram media.

Telegram VCVideoPlayBot An Telegram Bot By @ZauteKm To Stream Videos in Telegram Voice Chat. NOTE: Make sure you have started a VoiceChat in your Grou

Releases(v1.11.0)
  • v1.11.0(Jun 17, 2022)

    What's Changed

    • Remove Python 2.7 supporting. by @AlexNik in https://github.com/arc298/instagram-scraper/pull/781
    • Add fileno method for class LockedStream by @fgremler in https://github.com/arc298/instagram-scraper/pull/782
    • Add explanation to KeyError: 'csrftoken' by @nns33213 in https://github.com/arc298/instagram-scraper/pull/788
    • Fix https://github.com/arc298/instagram-scraper/issues/805 by @nns33213 in https://github.com/arc298/instagram-scraper/pull/809

    New Contributors

    • @fgremler made their first contribution in https://github.com/arc298/instagram-scraper/pull/782

    Full Changelog: https://github.com/arc298/instagram-scraper/compare/v1.10.6...v1.11.0

    Source code(tar.gz)
    Source code(zip)
  • v1.10.6(Feb 24, 2022)

    What's Changed

    • Urgent: don't swallow WEBP posts by @nns33213 in https://github.com/arc298/instagram-scraper/pull/779

    Full Changelog: https://github.com/arc298/instagram-scraper/compare/v1.10.5...v1.10.6

    Source code(tar.gz)
    Source code(zip)
  • v1.10.5(Feb 14, 2022)

    What's Changed

    • Added way to collect stories metadata without downloading video/images by @akorchyn in https://github.com/arc298/instagram-scraper/pull/753
    • Fix file download, properly show if profile is private by @nns33213 in https://github.com/arc298/instagram-scraper/pull/762
    • Processing failed status response. Fixes #765, fixes #764 by @VasiliPupkin256 in https://github.com/arc298/instagram-scraper/pull/775
    • Fix --followings-output path by @Roffild in https://github.com/arc298/instagram-scraper/pull/718

    New Contributors

    • @akorchyn made their first contribution in https://github.com/arc298/instagram-scraper/pull/753

    Full Changelog: https://github.com/arc298/instagram-scraper/compare/1.10.4...v1.10.5

    Source code(tar.gz)
    Source code(zip)
  • v1.10.4(Jan 21, 2022)

  • v1.10.3(Nov 10, 2021)

  • v1.10.2(Aug 18, 2021)

  • v1.10.1(Aug 1, 2021)

    Changes

    • Docker build suport. #660
    • Extra info from account. #688
    • Exit when login is failed. #677
    • If JSON parse fails, parse as HTML. #512
    • Fall back to alternate content-length header #511
    Source code(tar.gz)
    Source code(zip)
  • v1.10.0(May 19, 2021)

  • v1.9.1(Oct 10, 2020)

    Changes

    • Get profile pictures if it exists. #523
    • Adds broadcast to values in --media-types flag. #531
    • Adds new shareddata format extraction. Fixes #578, #515. #580

    Bug Fixes

    • Fixes #525 bug that allows duplicate post metadata by comparing the ID of the posts before saving them to json. Ensure the posts we saved are newer than the timestamp provided by --latest-stamps arg. If this arg is not used then all posts collected are saved. #532
    • Fixes null reference error when getting location #568
    • Fix for video_resources #594. #595
    Source code(tar.gz)
    Source code(zip)
  • v1.9.0(Apr 12, 2020)

    New Features

    • Get live video / broadcast #501

    Changes

    • Change the file format when using filter_locations_file to an ini file format #504

    Bug Fixes

    • Fix bug related to search-location when 'lat' or 'lng' are not found #505
    • Handles missing has_anonymous_profile_picture key from response #506
    Source code(tar.gz)
    Source code(zip)
  • v1.8.1(Feb 7, 2020)

    Bug Fixes

    • Fixes include_location argument error #489
    • Fixes typo during argument checking #489
    • Workaround for Bad Gateway error #488
    • Force using IPv4 when the machine supports IPv6 networks #493
    Source code(tar.gz)
    Source code(zip)
  • 1.8.0(Feb 1, 2020)

    New Features

    • Adds location filtering for tag and location scrap queries #483
    • Adds fetching up highlight stories #481

    Changes

    • Get profile metadata without login #477
    • Update Instagram version in STORIES_UA #474

    Bug Fixes

    • Fixes help message for --proxies #479
    Source code(tar.gz)
    Source code(zip)
  • 1.7.1(Aug 15, 2019)

  • 1.7.0(Aug 10, 2019)

  • v1.6.1(May 18, 2019)

  • 1.6.0(Mar 31, 2019)

    New Features

    • Adds --profile-metadata argument to enable scraping for profile metadata #341
    • Add ability to use proxy while using instagram session #348

    Changes

    • Make logging in as guest default behavior if no username/pw specified #327
    • Changes error log to info log since we can get profile information when user is private #356

    Bug Fixes

    • Fixes bug when using --latest parameter.
    Source code(tar.gz)
    Source code(zip)
  • v1.5.41(Dec 31, 2018)

    Bug Fixes

    • Fixes 403 Forbidden for url #316
    • Adds command-line option to store cookies at end of scraping and reload them in the next use #324
    Source code(tar.gz)
    Source code(zip)
  • v1.5.40(Oct 10, 2018)

  • v1.5.39(Aug 27, 2018)

  • 1.5.38(Aug 14, 2018)

  • 1.5.37(Jul 29, 2018)

    Changes

    • Avoids creation of empty directory if there is nothing to download #256
    • Uses query parameters if static url is forbidden and range requests handling improvement #263
    • Updates user-agents to a more recent version #265
    • Fixes --template not being applied to profile picture #271
    Source code(tar.gz)
    Source code(zip)
  • 1.5.36(May 27, 2018)

  • 1.5.35(May 25, 2018)

  • 1.5.34(May 24, 2018)

    Bug Fixes

    • Fix Login Csrftoken #233 #231 #234
    • Don't redownload the latest item with --latest-stamps #224
    • Avoid re-downloading profile pic with --latest-stamps #223
    Source code(tar.gz)
    Source code(zip)
  • 1.5.32(Apr 19, 2018)

  • 1.5.31(Apr 14, 2018)

    Bug Fixes

    • Due to changes to Instagram's API, authentication by supplying a login user (-u) and password (-p) arguments is now mandatory, adds x-instagram-gis header for authentication #210, #212, #213
    Source code(tar.gz)
    Source code(zip)
  • 1.5.28(Apr 8, 2018)

  • 1.5.27(Apr 4, 2018)

  • 1.5.26(Mar 27, 2018)

  • 1.5.25(Mar 24, 2018)

Owner
I love making things, collaborating, and solving interesting problems. Let's build something awesome together.
Receive GitHub webhook events and send to Telegram chats with AIOHTTP through Telegram Bot API

GitHub Webhook to Telegram Receive GitHub webhook events and send to Telegram chats with AIOHTTP through Telegram Bot API What this project do is very

Dash Eclipse 33 Jan 03, 2023
A fork of discord.py meant to replace it

Texus A modern, easy to use, feature-rich, and async ready API wrapper for Discord written in Python. Key Features Modern Pythonic API using async and

Texus 1 Nov 18, 2021
A Discord bot for viewing any currency you want comfortably.

Dost Dost is a Discord bot for viewing currencies. Getting Started These instructions will get you a copy of the project up and running on your local

Baran Gökalp 2 Jan 18, 2022
buys ethereum based on graphics card moving average price on ebay

ebay_trades buys ethereum based on graphics card moving average price on ebay Built as a meme, this application will scrape the first 3 pages of ebay

ConnorCreate 41 Jan 05, 2023
Amanda-A next gen powerful telegram group manager bot for manage your groups and have fun with other cool modules.

Amanda-A next gen powerful telegram group manager bot for manage your groups and have fun with other cool modules.

Team Amanda 4 Oct 21, 2022
Clippin n grafting Backend

Clipping' n Grafting Presenting you, 🎉 Clippin' n Grafting 🎉 , your very own ecommerce website displaying all your artsy-craftsy stuff. Not only the

Google-Developer-Student-Club-ISquareIT (GDSC I²IT) 2 Oct 22, 2021
A file-based quote bot written in Python

Let's Write a Python Quote Bot! This repository will get you started with building a quote bot in Python. It's meant to be used along with the Learnin

1 Feb 03, 2022
A free sniper bot built to work with PancakeSwap: Router V2

Pancakeswap Sniper Bot PancakeSwap sniper bot. Automated sniping bot to snipe crypto coin launches. How it works The sniping bot can be used in three

89 Aug 06, 2022
Wetterdienst - Open weather data for humans

We are a group of like-minded people trying to make access to weather data in Python feel like a warm summer breeze, similar to other projects like rdwd for the R language, which originally drew our

226 Jan 04, 2023
Policy and data administration, distribution, and real-time updates on top of Open Policy Agent

⚡ OPAL ⚡ Open Policy Administration Layer OPAL is an administration layer for Open Policy Agent (OPA), detecting changes to both policy and policy dat

8 Dec 07, 2022
🔍 Google Search unofficial API for Python with no external dependencies

Python Google Search API Unofficial Google Search API for Python. It uses web scraping in the background and is compatible with both Python 2 and 3. W

Avi Aryan 204 Dec 28, 2022
A Slash Commands Discord Bot created using Pycord!

Hey, I am Slash Bot. A Bot which works with Slash Commands! Prerequisites Python 3+ Check out. the requirements.txt and install all the pakages. Insta

Saumya Patel 18 Nov 15, 2022
The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Easy-to-use toolkit for retrieval-based Chatbot Our released data can be found at this link. Make sure the following steps are adopted to use our code

GMFTBY 32 Nov 13, 2022
Yet another random discord bot.

YARDB (r!) Yet another fully functional and random discord bot. I might add more features if I'm bored also don't criticize on my code. Commands: 4 Di

kayle 1 Oct 21, 2021
The Social-Engineer Toolkit (SET) is specifically designed to perform advanced attacks against the human element.

The Social-Engineer Toolkit (SET) The Social-Engineer Toolkit (SET) is specifically designed to perform advanced attacks against the human element. SE

Professor 6 Nov 28, 2022
An advanced automatic top.gg dank memer voter that votes automatically for you.

Auto Dank Memer Voter An automatic dank memer voter that sends votes onto top.gg every 12 hours, unless their is captcha. I am working on a captcha de

6 Aug 27, 2022
Python Twitter API

Python Twitter Tools The Minimalist Twitter API for Python is a Python API for Twitter, everyone's favorite Web 2.0 Facebook-style status updater for

Mike Verdone 2.9k Jan 03, 2023
Track to Detect and Segment: An Online Multi-Object Tracker (CVPR 2021)

Track to Detect and Segment: An Online Multi-Object Tracker (CVPR 2021) Track to Detect and Segment: An Online Multi-Object Tracker Jialian Wu, Jiale

Jialian Wu 520 Dec 31, 2022
Prabashwara's Pm Bot repository. You can deploy and edit this repository.

Tᴇʟᴇɢʀᴀᴍ Pᴍ Bᴏᴛ | Prabashwara's PM Bot Unmaintained. The new repo of @Pm-Bot is private. (It is no longer based on this source code. The completely re

Rivibibu Prabshwara Ⓒ 2 Jul 05, 2022
Balanced API library in python.

Balanced Online Marketplace Payments v1.x requires Balanced API 1.1. Use v0.x for Balanced API 1.0. Installation pip install balanced Usage View Bala

Balanced 70 Oct 04, 2022