Python script to download all images/webms of a 4chan thread

Overview

4chan-downloader

Python script to download all images/webms of a 4chan thread

Download Script

The main script is called inb4404.py and can be called like this: python inb4404.py [thread/filename]

usage: inb4404.py [-h] [-c] [-d] [-l] [-n] [-r] thread

positional arguments:
  thread              url of the thread (or filename; one url per line)

optional arguments:
  -h, --help          show this help message and exit
  -c, --with-counter  show a counter next the the image that has been
                      downloaded
  -d, --date          show date as well
  -l, --less          show less information (surpresses checking messages)
  -n, --use-names     use thread names instead of the thread ids
                      (...4chan.org/board/thread/thread-id/thread-name)
  -r, --reload        reload the queue file every 5 minutes
  -t, --title         save original filenames

You can parse a file instead of a thread url. In this file you can put as many links as you want, you just have to make sure that there's one url per line. A line is considered to be a url if the first 4 letters of the line start with 'http'.

If you use the --use-names argument, the thread name is used to name the respective thread directory instead of the thread id.

Thread Watcher

This is a work-in-progress script but basic functionality is already given. If you call the script like

python thread-watcher.py -b vg -q mhg -f queue.txt -n "Monster Hunter"

then it looks for all threads that include mhg inside the vg board, stores the thread url into queue.txt and adds /Monster-Hunter at the end of the url so that you can use the --use-names argument from the actual download script.

Legacy

The current scripts are written in python3, in case you still use python2 you can use an old version of the script inside the legacy directory.

Comments
  • "Something went wrong"

    Regardless of thread, board, script location, drive letter, attribute, or host machine the script returns "Something went wrong" and times out after several tries. image Above picture showing different threads and boards and different attribute flags, all three failing almost immediately. image Interestingly however it works fine on an arch install. I've tried it on two machines running Windows 10, reinstalled python 3 as well Not entirely sure why it's dropping almost immediately on Windows. Assume this is a Windows problem, but nothing changed since I used it yesterday. will close if I find a solution to using it on windows. sorry there isn't much information to go off of.

    opened by cherioux 16
  • multiple threads from file error

    multiple threads from file error

    File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap self.run() Process Process-1: File "Python\Python38-32\lib\multiprocessing\process.py", line 108, in run self._target(*self._args, **self._kwargs) File "4chdl\inb4404.py", line 44, in download_thread if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)): AttributeError: 'NoneType' object has no attribute 'use_names' Traceback (most recent call last): File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap self.run() File "Programs\Python\Python38-32\lib\multiprocessing\process.py", line 108, in run self._target(*self._args, **self._kwargs) File "4chdl\inb4404.py", line 44, in download_thread if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)): AttributeError: 'NoneType' object has no attribute 'use_names'

    opened by th3illu 8
  • Filenames...

    Filenames...

    I've run into a few issues with filenames. One of them isn't entirely Windows specific, and I have an idea of what needs to be done to fix it, but no idea how to. Duplicate filenames within a thread simply overwrite any preceding files. Usually Spoiler_Image or file.png.

    The other issue is Windows specific and I've managed to solve it at least locally by:

    from django.utils.text import get_valid_filename
    ...snip...
                    img_path = ntpath.join(directory, get_valid_filename(img))
    ...snip...
    

    The issue in question, filenames like this used to make the script halt on Windows:

    C:\vtai>inb4404.py -c -d -l -t https://boards.4channel.org/vt/thread/34806758
    [2022-10-11 07:50:46 PM] [  9/278] vt/34806758/[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png
    Traceback (most recent call last):
      File "C:\vtai\inb4404.py", line 171, in <module>
        main()
      File "C:\vtai\inb4404.py", line 31, in main
        download_thread(thread, args)
      File "C:\vtai\inb4404.py", line 112, in download_thread
        with open(img_path, 'wb') as f:
    OSError: [Errno 22] Invalid argument: 'C:\\vtai\\downloads\\vt\\34806758\\[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png'
    

    After importing it appears to work.

    opened by vt-idiot 5
  • Script can't download files from a thread with a other thread quoted (Cross-thread)

    Script can't download files from a thread with a other thread quoted (Cross-thread)

    NSFW

    How to reproduce:

    1° Try to download from this link: https://boards.4chan.org/gif/thread/13521437

    This error appears

    File TTT Part 2 Top Tier Titties \1527053877481.webm from https://i.4cdn.org/gif/1536882732639.webm UNKNOWN ERROR OCCURRED [WinError 183] cannot create an existing file: 'C:\\Users\\mario\\4channer\\TTT Part 2 Top Tier Titties '

    But the "TTT Part 2 Top Tier Tittiles" and the two .webm files doens't exist. Using a custom folder location doens't work.

    opened by ghost 5
  • Corrupt downloads

    Corrupt downloads

    Hi! Just downloaded your script but every file downloaded is corrupt. I picked a random thread https://boards.4chan.org/b/thread/573855146/pink-ids-will-have-pink-eye-for-the-rest-of-their, but also tried with some more with same result.

    Is there any kind of logging i can enable to help you sort this out, if you're not experiencing the same problem?

    Cheers! Luca

    opened by LucaNonato 4
  • auto download/ all images in one folder?

    auto download/ all images in one folder?

    i want all the images of a certian tag to go into just one folder instead of each individual one, is there a way to do that? also is there a feature to autodownload threads in the queue text file when one appears

    question 
    opened by azzerzzzeqwe 3
  • Logo for the Downloader Readme

    Logo for the Downloader Readme

    Hey, i would like to propose a Logo for the readme header and maybe even as icon for a PyQt GUI-version if that one ever gets a go. I have altered the original 4chan Logo.

    opened by ThisLimn0 2
  • invalid syntax?

    invalid syntax?

    I guess I miss something very simple, but cannot figure it out, so here goes nothing

    I followed the instructions to in README.md and typed

    $ python inb4404.py http://boards.4chan.org/w/thread/2096591/lain-thread-anyone-have-any-arisu

    The only result was:

    File "inb4404.py", line 129 print(line.replace(link, '-' + link), end='') ^ SyntaxError: invalid syntax

    And I'm too dumb to know if this is a typo to fix or I should use a different URL (direct one to the first pic in thread maybe?).

    (After hundreds of shameful edits): also the output points at end='' equation, but for love of god I have no idea how to paste it here so markdown does not cut out all spaces.

    opened by tcheerno 2
  • Rewrite/Convert to python3?

    Rewrite/Convert to python3?

    The 2to3 tool should automate this as far as possible. I think the script should be converted to python3 because this is the new standard, support for python2 will be dropped in near future.

    opened by KopfKrieg 2
  • Stuck on checking the first post of the thread

    Stuck on checking the first post of the thread

    The script keeps "checking" the first post of the thread specified as thread_link

    I'm using python 2.7.13, script doesn't work with python 3 because of an error at line 73.

    opened by elleborgo 2
  • Reloading the file doesn't work properly

    Reloading the file doesn't work properly

    Single thread and multiple thread (using a file) works just fine, but the script is unable to reload the file and update the queue properly as of now.

    opened by Exceen 2
  • Configurable threads

    Configurable threads

    I added a lot between commits, which was a mistake, but the rundown is that I re-worked the threaded downloads.

    Previously each 4chan thread would get its own process to check and download new images. I created a queue (manager.list) system and made static workers/processes. By default, 4 will be created, but that is configurable with -p. As 'jobs' are pulled from the queue, a worker thread will work on it, then wait until another job is available to pull from the queue.

    Let me know if you have any questions or want me to change anything.

    Thanks!

    opened by Zand3r24 4
Releases(1.0)
  • 1.0(Jul 23, 2018)

    • Supports downloading images and webm-files of single and multiple (multi-threading) 4chan-threads with continuous checks for new posts.
    • Assign names to threads to locate them easier on your hard drive
    • When using a file to download multiple threads at once, 404'd threads will be marked automatically with a "-" in front of the URL.
    • Separates downloads into a "download" directory which serves as archive and a "new" directory. Downloaded files are put into both directories. If a files is deleted inside the "download" directory is will be downloaded again. On the other hand, if a file inside the "new" directory is deleted it won't be downloaded again. This serves as an easy way to keep a whole thread archived and to track new downloads. Therefore, deleting a file inside the "new" directory serves as some kind of a "mark as read" feature.
    • Thread Watcher is more or less an Add-On for the actual download file which checks for threads including a specified text and adds the respective URL into a text file which can be used with the download script.
    Source code(tar.gz)
    Source code(zip)
Owner
Micha Fink
Micha Fink
A Telegram bot to download Subtitle for movies and tv shows.

Subtitle Downloader Bot A Telegram bot to download Subtitle for movies and tv shows. Host on Heroku Configuring Environments API_HASH : Your Telegram

Joy Biswas 15 Nov 12, 2022
Simple Python script to download images and videos from public subreddits without using Reddit's API 😎

Subreddit Media Downloader Download images and videos from any public subreddit without using Reddit's API Made with ❤ by Nico 💬 About: This script a

Nico 106 Jan 07, 2023
Download Thumbnail of YouTube Videos

Download Thumbnail of YouTube Videos in High Quality Variables: API_ID : Get From my.telegram.org API_HASH : Get from my.telegram.org BOT_TOKEN : Your

Arun 6 Jun 08, 2022
🔥 A Bot To Telegram For Download High Qulity Videos & Songs From Youtube

🔥 A Bot To Telegram For Download High Qulity Videos & Songs From Youtube 🎗 Fast And Free Bot No Need To Pay ✅ By SL-Alpha-X-Team ⚡

Official Alpha-X-Team Account 7 Aug 31, 2022
Download the resources of the Blue Archive easily!

blue-archive-bundle-downloader Download the resources of the Blue Archive easily! Known issue In Windows It works only if the console is "fullscreen"

Ryu juheon 7 Apr 08, 2022
Simple avogadr.io batch downloader python script

Simple avogadr.io batch downloader python script

2 Jan 19, 2022
A downloader for Cave Story written in Python

Cave Story Downloader This is a downloader for Cave Story written in Python. Thi

Imsad2 2 Feb 16, 2022
YoutubeDownloader - Repo for downloading YT audio and videos

YoutubeDownloader Downloads video/playlist/audio from youtube url. install all t

Anuj SP 2 Feb 17, 2022
MMDL (Mega Music Downloader) - A tool to easily download music.

mmdl - Mega Music Downloader What is mmdl ❓ TLDR: MMDL is a cli app which allows you to quickly and efficiently download one or multiple songs from Yo

techboy-coder 30 Dec 13, 2022
Simple Youtube Video Downloader

Simple Youtube Video Downloader Download Youtube video using link and Will output result in D:/ (You can change the path in main.py file) Installation

Hansen Gianto 1 Oct 28, 2021
Download YouTube videos/music and images in MP4, JPG with this tool.

ABOUT THE TOOL Download YouTube videos, music and images in MP4, JPG with this tool, with an easy to understand interface. This tool works with both,

TrollSkull 5 Jan 02, 2023
Implementation of Cross-category Video Highlight Detection via Set-based Learning (ICCV 2021).

Cross-category Video Highlight Detection via Set-based Learning Introduction This project is an implementation of ``Cross-category Video Highlight Det

Minghao (Alan) Xu 49 Dec 17, 2022
Download all games from a public Itch.io Game Jam

Itch Jam Downloader Downloads all games from a public Itch.io Game Jam. What you'll need: Python 3.8+ pip install -r requirements.txt For site mirrori

Dragoon Aethis 19 Dec 07, 2022
A simple Python +3.x script to download videos from Facebook.

Facebook Video Downloader A simple Python +3.x script to download videos from Facebook posts

Kerolos Atef Saber 1 Dec 03, 2021
Download all games from a public Itch.io Game Jam

Itch Jam Downloader Downloads all games from a public Itch.io Game Jam. What you'll need: Python 3.8+ pip install -r requirements.txt For site mirrori

Dragoon Aethis 19 Dec 07, 2022
Pytube ve tkinter kütüphanesi ile yapmış olduğum basit ve temel bir youtube video indirme programı.

PyTube Pytube ve tkinter kütüphanesi ile yapmış olduğum basit ve temel bir youtube video indirme programı. Videolar 720p çözünürlükte indirilmektedir.

1 Nov 12, 2021
Python Program that downloads gaming required packages based on your Linux Distribution.

LibreGaming Python Program that downloads gaming required packages based on your Linux Distribution. Table of contents Distributions Prerequisites Dep

Ahmed Al Balochi 195 Jan 01, 2023
Utility for downloading works from AO3 (Archive Of Our Own)

ao3d video preview A small graphical utility for batch downloading works from AO3 (Archive Of Our Own) Features Batch downloading works to supported f

flux 24 Dec 09, 2022
Noto fonts go universal! Download Noto fonts combined to suit your region (South Asia, SE Asia, Africa-MiddleEast, Europe-Americas).

Go Noto Universal Noto fonts go universal! Download Noto fonts combined to suit your region (South Asia, SE Asia, East Asia, Africa-MiddleEast, Europe

Satish B 67 Jan 06, 2023
Twayback: Downloading deleted Tweets from the Wayback Machine, made easy

Finding and downloading deleted Tweets takes a lot of time. Thankfully, with this tool, it becomes a piece of cake! 🎂

126 Dec 27, 2022