Using Selenium with Python to Web Scrape Popular YouTube Tech Channels.

Last update: Aug 18, 2021

Overview

Web Scraping Popular YouTube Tech Channels with Selenium

Data Mining, Data Wrangling, and Exploratory Data Analysis

About the Data

Web scraping was performed on the top 10 tech channels on YouTube using Selenium, an automated browser driver controlled from Python that is often used for web scraping and web testing. The channels were chosen from a Top 10 Tech YouTubers list published on blog.bit.ai.
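As a rough illustration of the scraping step, here is a minimal sketch that opens a channel's videos tab and collects the visible video titles. It is not the project's actual scraper: the function names, the `/videos` URL suffix, and the `#video-title` CSS selector are assumptions, and YouTube's markup changes often. It also assumes Chrome and a matching driver are available on the machine.

```python
def videos_tab_url(channel_url):
    """Return the videos-tab URL for a channel URL (hypothetical helper)."""
    return channel_url.rstrip("/") + "/videos"


def scrape_video_titles(channel_url, max_titles=20):
    """Collect up to max_titles video titles from a channel's videos tab."""
    # Imported here so the URL helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes Chrome and its driver are installed
    try:
        # Wait up to 10 seconds for elements to appear before giving up.
        driver.implicitly_wait(10)
        driver.get(videos_tab_url(channel_url))
        # "#video-title" is an assumed selector, not taken from the project.
        elements = driver.find_elements(By.CSS_SELECTOR, "#video-title")
        return [el.text for el in elements[:max_titles] if el.text]
    finally:
        # Always close the browser, even if the page fails to load.
        driver.quit()
```

Only the first screen of videos is loaded this way. Collecting around 200 videos per channel would also need scrolling the page until more results load.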

All data was saved to multiple CSV files for further analysis in a Google Colab notebook. Please see the notebook for more details.
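Saving scraped rows to CSV can be sketched with the standard library alone. The column names (`channel`, `title`, `views`) and the `write_rows` helper are hypothetical and only show the shape of the step, not the project's real schema.

```python
import csv


def write_rows(rows, fieldnames, fileobj):
    """Write a list of dicts to an open text file as CSV with a header row."""
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```

With a real file, open it with `newline=""` so the `csv` module controls line endings, for example `open("videos.csv", "w", newline="", encoding="utf-8")`. One file per table (channels, videos) keeps each CSV easy to load in a notebook.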

Sample of Data Collected

Each channel contributed around 200 videos on average. In total, data from 2000 videos was scraped.

Word Cloud of Word Frequency in Video Titles
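The word cloud is driven by how often each word appears across video titles. A minimal sketch of that counting step is shown here. The tokenizing rule and the small stop-word list are assumptions made for illustration, and the project may have used a dedicated word cloud library instead.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for", "on", "vs"}


def title_word_counts(titles):
    """Count lowercase words across all titles, skipping stop words."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z0-9']+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts
```

Calling `title_word_counts(titles).most_common(20)` gives the most frequent words, which are the ones a word cloud would draw largest.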

Takeaways

Video comment counts show very little correlation with any of the other data collected in this project.
The following pairs appear to be highly correlated:
- Channel Views and Subscribers
- Interactions and Video Views
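For reference, a correlation between two columns such as those above can be checked with a Pearson coefficient. This sketch computes it by hand so it has no dependencies. It assumes Pearson correlation is the measure in question, which the project does not state, and the `pearson` function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / (spread_x * spread_y)
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none, which is what the comment counts showed.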
Video titles fall into 5 topic groups.

Kmeans and PCA used to create clusters for video titles
- iPhone (kmeans 0)
- Samsung (kmeans 1)
- Reviews (kmeans 2)
- Unboxing (kmeans 3)
- How-to (kmeans 4)
70% of the most viewed videos are about phones.
Join Date (the date a YouTube channel was created) does not seem to have any relationship to number of subscribers or overall cha
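The title clustering can be sketched with scikit-learn. This is one possible arrangement, not necessarily the project's pipeline: it assumes titles are turned into TF-IDF vectors, K-means assigns the cluster labels, and PCA is used only to reduce the vectors to two dimensions for plotting. The `cluster_titles` function and its parameters are hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_titles(titles, n_clusters=5, seed=0):
    """Return (cluster labels, 2-D PCA coordinates) for a list of titles."""
    # Turn each title into a TF-IDF vector, dropping common English words.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(titles)
    # Assign each title to one of n_clusters groups.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(vectors)
    # Project to 2-D for a scatter plot; PCA needs a dense array.
    coords = PCA(n_components=2, random_state=seed).fit_transform(vectors.toarray())
    return labels, coords
```

The cluster numbers themselves carry no meaning. Names such as "Reviews" or "Unboxing" come from reading the titles that land in each cluster.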

Project Links

"Data Analysis of Youtube Tech Channels"

Owner

David Rusho
