An automated, headless YouTube Watcher and Scraper

Overview

Logo

MIT License Code style: black
Platform: YouTube Uses Docker Automation supporting Firefox and Chrome
Uses Tor. And an emoji of an onion, get it? Firefox supported Chrome supported

An automated, headless YouTube Watcher and Scraper

Authors: Christian C., Moritz M., Luca S.
Related Projects: YouTube Uploader, Twitch Compilation Creator, Neural Networks


About

Searches YouTube, queries recommended videos and watches them. All fully automated and anonymised through the Tor network. The project consists of two independently usable components, the YouTube automation written in Python and the dockerized Tor Browser.

This project is for educational purposes only. Using Tor to watch YouTube videos is strongly discouraged, especially for Botting purposes. Please inform yourself about the Tor network, before using it extensively.

Setup

YouTube Automation

This project requires Poetry to install the required dependencies. Check out this link to install Poetry on your operating system.

Make sure you have installed Python 3.8! Otherwise Step 3 will let you know that you have no compatible Python version installed.

  1. Clone/Download this repository
  2. Navigate to the root of the repository
  3. Run poetry install to create a virtual environment with Poetry
  4. Either run the dockerized Browser with docker-compose up, install geckodriver for a local Firefox or ChromeDriver for Chromium. Ensure that geckodriver/ChromeDriver are in a location in your $PATH.
  5. Run poetry run python main.py to run the program. Alternatively you can run poetry shell followed by python main.py. By default this connects to the dockerized Browser. To automate a different Browser use the --browser [chrome/firefox] command line option.

Dockerized Tor Browser

Running the Container requires Docker and docker-compose.

  1. Clone/Download this repository

  2. Navigate to the root of the repository

  3. Run docker-compose up. The image will be built automatically before startup.

  4. Selenium can now connect to the browser via port 4444. In Python the connection can be established with the following command.

    driver = webdriver.Remote(
        command_executor="http://127.0.0.1:4444/wd/hub",
        desired_capabilities=options,
    )

    See main.py for more information.

Run Parameters

All of these parameters are optional and a default value will be used if they are not defined. You can also get these definitions by running main.py --help

usage: main.py [-h] [-B {docker,chrome,firefox}] [-t] [--disable-tor] -s SEARCH_TERMS [-c CHANNEL_URL]

optional arguments:
  -h, --help            show this help message and exit
  -B {docker,chrome,firefox}, --browser {docker,chrome,firefox}
                        Select the driver/browser to use for executing the script.
  -t, --enable-tor      Enables Tor usage by connecting to a proxy on localhost:9050. Only usable with the docker
                        executor.
  --disable-tor         Disables the Tor proxy.
  -s SEARCH_TERMS, --search-terms SEARCH_TERMS
                        This argument declares a list of search terms which get viewed.
  -c CHANNEL_URL, --channel-url CHANNEL_URL
                        Channel URL if not declared it uses Golden Gorillas channel URL as default.
中国大学生在线 四史自动答题刷分(现仅支持英雄篇)

中国大学生在线 “四史”学习教育竞答 自动答题 刷分 (现仅支持英雄篇,已更新可用) 若对您有所帮助,记得点个Star 🌟 !!! 中国大学生在线 “四史”学习教育竞答 自动答题 刷分 (现仅支持英雄篇,已更新可用) 🥰 🥰 🥰 依赖 本项目依赖的第三方库: requests 在终端执行以下

XWhite 229 Dec 12, 2022
A repository with scraping code and soccer dataset from understat.com.

UNDERSTAT - SHOTS DATASET As many people interested in soccer analytics know, Understat is an amazing source of information. They provide Expected Goa

douglasbc 48 Jan 03, 2023
The core packages of security analyzer web crawler

Security Analyzer 🐍 A large scale web crawler (considered also as vulnerability scanner tool) to take an overview about security of Moroccan sites Cu

Security Analyzer 10 Jul 03, 2022
Crawl the information of a given keyword on Google search engine

Crawl the information of a given keyword on Google search engine

4 Nov 09, 2022
A high-level distributed crawling framework.

Cola: high-level distributed crawling framework Overview Cola is a high-level distributed crawling framework, used to crawl pages and extract structur

Xuye (Chris) Qin 1.5k Jan 04, 2023
An arxiv spider

An Arxiv Spider 做为一个cser,杰出男孩深知内核对连接到计算机上的硬件设备进行管理的高效方式是中断而不是轮询。每当小伙伴发来一篇刚挂在arxiv上的”热乎“好文章时,杰出男孩都会感叹道:”师兄这是每天都挂在arxiv上呀,跑的好快~“。于是杰出男孩找了找 github,借鉴了一下其

Jie Liu 11 Sep 09, 2022
for those who dont want to pay $10/month for high school game footage with ads

nfhs-scraper Disclaimer: I am in no way responsible for what you choose to do with this script and guide. I do not endorse avoiding paywalls or any il

Conrad Crawford 5 Apr 12, 2022
Telegram group scraper tool

Telegram Group Scrapper

Wahyusaputra 2 Jan 11, 2022
This scrapper scrapes the mail ids of faculty members from a given linl/page and stores it in a csv file

This scrapper scrapes the mail ids of faculty members from a given linl/page and stores it in a csv file

Devansh Singh 1 Feb 10, 2022
Automatically scrapes all menu items from the Taco Bell website

Automatically scrapes all menu items from the Taco Bell website. Returns as PANDAS dataframe.

Sasha 2 Jan 15, 2022
抢京东茅台脚本,定时自动触发,自动预约,自动停止

jd_maotai 抢京东茅台脚本,定时自动触发,自动预约,自动停止 小白信用 99.6,暂时还没抢到过,朋友 80 多抢到了一瓶,所以我感觉是跟信用分没啥关系,完全是看运气的。

Aruelius.L 117 Dec 22, 2022
Binance Smart Chain Contract Scraper + Contract Evaluator

Pulls Binance Smart Chain feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.Returns only those with socials information included, and then submit

14 Dec 09, 2022
Python web scrapper

Website scrapper Web scrapping project in Python. Created for learning purposes. Start Install python Update configuration with websites Launch script

Nogueira Vitor 1 Dec 19, 2021
Tool to scan for secret files on HTTP servers

snallygaster Finds file leaks and other security problems on HTTP servers. what? snallygaster is a tool that looks for files accessible on web servers

Hanno Böck 2k Dec 28, 2022
Bulk download tool for the MyMedia platform

MyMedia Bulk Content Downloader This is a bulk download tool for the MyMedia platform. USE ONLY WHERE ALLOWED BY THE COPYRIGHT OWNER. NOT AFFILIATED W

Ege Feyzioglu 3 Oct 14, 2022
A web crawler for recording posts in "sina weibo"

Web Crawler for "sina weibo" A web crawler for recording posts in "sina weibo" Introduction This script helps collect attributes of posts in "sina wei

4 Aug 20, 2022
Create crawler get some new products with maximum discount in banimode website

crawler-banimode create crawler and get some new products with maximum discount in banimode website. این پروژه کوچک جهت یادگیری و کار با ابزار سلنیوم

nourollah rezaei 2 Feb 17, 2022
Luis M. Capdevielle 1 Jan 14, 2022
Script used to download data for stocks.

This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the d

Carmelo Gonzales 71 Oct 04, 2022
A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

Alex Papadopoulos 1 Nov 13, 2021