Console application for downloading images from Reddit in Python

Overview

RedditImageScraper

Console application for downloading images from Reddit in Python

Screenshot

Introduction

This short Python script was created for the mass-downloading of images from Reddit. It will be used later for creating data-sets for several Machine Learning projects.

In order to use the script, you will have to have a Reddit account sign-up to create a developer account. You will be assigned a client_id and client_secret which you have to enter in config.ini before you run the script.

Usage

The -r parameter provides a list of sub-reddits to search.
The -st parameter can specify the maximum number of images to download from each sub-reddit. Defaults to 1000.
The -t parameter can specify the total number of images to download across all sub-reddits before we abort. Defaults to 10000.
The -f parameter can specify the folder into which we download the images. Defaults to download.

For example, to download at-most 20 pictures from the dogpictures, dogswithjobs, GuiltyDogs, and dogs sub-reddits, aborting when we have 50 files in total, and saving the files to a folder titled dog, we would use the following:

python RedditImageScraper.py -r dogpictures dogswithjobs GuiltyDogs dog -st 20 -t 50 -f dogs

This should produce something like the following:

Downloading from dogpictures
3tdfpayjp1k51.jpg
cuz4a5np90k51.jpg
1e1g882z40k51.jpg
xX3OQgP.jpg
fuatv49mizj51.jpg
tk9khtspb1k51.jpg
pgbxxakm63k51.jpg
oeUI8Iy.jpg
vqklutghc1k51.jpg
1ctn7f0390k51.jpg
r7995a1dd3k51.jpg
qbjj06vaa0k51.jpg
q6nl05omyzj51.jpg
bl8bi5tsu2k51.jpg
gxc78spvxxj51.jpg
w0pdsr1hsyj51.jpg
9h19nq1k5vj51.jpg
y67tpittfyj51.jpg
Downloaded 18 from dogpictures

Downloading from dogswithjobs
5xxpn6xs7xj51.png
3mrgwnlum1k51.jpg
rs2uecgnb1k51.jpg
y077mg1974k51.jpg
kci6u8pc02k51.jpg
iho9wex0qrj51.jpg
109eyp6kjyj51.jpg
i86x3o6dutj51.jpg
Downloaded 8 from dogswithjobs

Downloading from GuiltyDogs
8z89s7a89dj51.jpg
c9rf2r516li51.jpg
pbdqr853rsh51.jpg
e9xihfbqdeh51.jpg
53gamygu9ch51.jpg
d3tq02dbbyg51.jpg
ifsmwutou2h51.jpg
Downloaded 7 from GuiltyDogs

Downloading from dog
1kloilrhc1k51.jpg
bwe1go65h1k51.jpg
8118vyqeg1k51.jpg
bajprhddg0k51.jpg
rlc7n4m6q0k51.jpg
z9p8llkuyyj51.jpg
dhdi10myx2k51.jpg
6zflnt9hozj51.jpg
niptrbxzf2k51.jpg
jxi3vrd901k51.jpg
u8eykob35yj51.jpg
5hwj8cce6zj51.png
9nr2t4f0vzj51.jpg
ozs8tuu7mzj51.jpg
0h1fwfqhh3k51.jpg
Downloaded 15 from dog

Downloaded 48 in total
Owner
James
Busily reinventing the wheel
James
Proxy scraper. Format: IP | PORT | COUNTRY | TYPE

proxy scraper 🔎 Installation: git clone https://github.com/ebankoff/proxy_scraper Required pip libraries (pip install library name): lxml beautifulso

Eban'ko 19 Dec 07, 2022
Scrapping the data from each page of biocides listed on the BAUA website into a csv file

Scrapping the data from each page of biocides listed on the BAUA website into a csv file

Eric DE MARIA 1 Nov 30, 2021
The first public repository that provides free BUBT website scraping API script on Github.

BUBT WEBSITE SCRAPPING SCRIPT I think this is the first public repository that provides free BUBT website scraping API script on github. When I was do

Md Imam Hossain 3 Feb 10, 2022
Scrapes proxies and saves them to a text file

Proxy Scraper Scrapes proxies from https://proxyscrape.com and saves them to a file. Also has a customizable theme system Made by nell and Lamp

nell 2 Dec 22, 2021
Html Content / Article Extractor, web scrapping lib in Python

Python-Goose - Article Extractor Intro Goose was originally an article extractor written in Java that has most recently (Aug2011) been converted to a

Xavier Grangier 3.8k Jan 02, 2023
Python script that reads Aliexpress offers urls from a Excel filename (.csv) and post then in a Telegram channel using a bot

Aliexpress to telegram post Python script that reads Aliexpress offers urls from a Excel filename (.csv) and post then in a Telegram channel using a b

Fernando 6 Dec 06, 2022
A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!

🕳️ CygnusX1 Code by Trong-Dat Ngo. Overviews 🕳️ CygnusX1 is a multithreaded tool 🛠️ , used to search and download images from popular search engine

DatNgo 32 Dec 31, 2022
A Simple Web Scraper made to Extract Download Links from Todaytvseries2.com

TDTV2-Direct Version 1.00.1 • A Simple Web Scraper made to Extract Download Links from Todaytvseries2.com :) How to Works?? install all dependancies v

Danushka-Madushan 1 Nov 28, 2021
优化版本的京东茅台抢购神器

优化版本的京东茅台抢购神器

1.8k Mar 18, 2022
Dailyiptvlist.com Scraper With Python

Dailyiptvlist.com scraper Info Made in python Linux only script Script requires to have wget installed Running script Clone repository with: git clone

1 Oct 16, 2021
A tool can scrape product in aliexpress: Title, Price, and URL Product.

Scrape-Product-Aliexpress A tool can scrape product in aliexpress: Title, Price, and URL Product. Usage: 1. Install Python 3.8 3.9 padahal halaman ins

Rahul Joshua Damanik 1 Dec 30, 2021
Get-web-images - A python code that get images from any site

image retrieval This is a python code to retrieve an image from the internet, a

CODE 1 Dec 30, 2021
热搜榜-python爬虫+正则re+beautifulsoup+xpath

仓库简介 微博热搜榜, 参数wb 百度热搜榜, 参数bd 360热点榜, 参数360 csdn热榜接口, 下方查看 其他热搜待加入 如何使用? 注册vercel fork到你的仓库, 右上角 点击这里完成部署(一键部署) 请求参数 vercel配置好的地址+api?tit=+参数(仓库简介有参数信息

Harry 3 Jul 08, 2022
Scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info

SpaceX Sofware I developed software to scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info to use the software you need Python a

Maxence Rémy 16 Aug 02, 2022
🥫 The simple, fast, and modern web scraping library

About gazpacho is a simple, fast, and modern web scraping library. The library is stable, actively maintained, and installed with zero dependencies. I

Max Humber 692 Dec 22, 2022
联通手机营业厅自动做任务、签到、领流量、领积分等。

联通手机营业厅自动完成每日任务,领流量、签到获取积分等,月底流量不发愁。 功能 沃之树领流量、浇水(12M日流量) 每日签到(1积分+翻倍4积分+第七天1G流量日包) 天天抽奖,每天三次免费机会(随机奖励) 游戏中心每日打卡(连续打卡,积分递增至最高

2k May 06, 2021
A Python library for automating interaction with websites.

Home page https://mechanicalsoup.readthedocs.io/ Overview A Python library for automating interaction with websites. MechanicalSoup automatically stor

4.3k Jan 07, 2023
This program will help you to properly scrape all data from a specific website

This program will help you to properly scrape all data from a specific website

MD. MINHAZ 0 May 15, 2022
Screenhook is a script that captures an image of a web page and send it to a discord webhook.

screenshot from the web for discord webhooks screenhook is a script that captures an image of a web page and send it to a discord webhook.

Toast Energy 3 Jun 04, 2022
Scrape puzzle scrambles from csTimer.net

Scroodle Selenium script to scrape scrambles from csTimer.net csTimer runs locally in your browser, so this doesn't strain the servers any more than i

Jason Nguyen 1 Oct 29, 2021