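A collection of web crawler examples, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, VIP (Weipu) and Wanfang, Z-Library, Oalib, novel sites, tendering sites, procurement sites, and Xiaohongshu.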

Last update: Jan 05, 2023

Overview

lxSpider

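A collection of web crawler examples, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, VIP (Weipu) and Wanfang, Z-Library, Oalib, novel sites, and tendering and procurement sites.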

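Introduction:

Time flies, and I have lost count of how many examples I have written. Articles are published on CSDN first, and the code is pushed to GitHub afterwards. Some of the CSDN articles are paid examples; subscribe as you see fit.

Disclaimer:

This repository is intended for teaching. Nothing it provides may be used for commercial purposes or in any illegal or non-compliant scenario.
The author accepts no liability for any loss or harm of any kind that users or others may suffer, for any reason, from using the code and strategies provided in this repository.
Any dispute arising from or related to this repository should be resolved through friendly negotiation between the parties; the author bears no responsibility for any consequences if negotiation fails.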

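Columns

Web crawler basics: for readers who know Python syntax and are about to start learning crawlers

Web reverse-engineering basics: prior crawler experience is enough (includes solutions to the Yuanrenxue crawler challenges)

Android reverse-engineering basics: tool introductions, reverse-engineering notes, case studies

Crawler case collection: paid column, classic cases, continuously updated

Blog

Contact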

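Releases (Kuaishou live-comment collection tool)

Kuaishou live-comment collection tool (Jan 30, 2021)
Usage:

1. Launch run.exe from the dist directory.

2. Enter the streamer uid, your cookie, and the room id.

3. Click Start and wait. Do not click it more than once.

4. Make sure the streamer is still live.

Getting the parameters:

Streamer uid: the last segment of the URL shown in the browser.

For example, if the URL is: https://live.kuaishou.com/u/yingjia2019

the streamer uid is: yingjia2019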
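
The rule above (the uid is the last segment of the live-room URL) can be sketched in a few lines of Python. This is only an illustration and is not part of the released tool; the function name `extract_streamer_uid` is a hypothetical helper introduced here.

```python
from urllib.parse import urlparse


def extract_streamer_uid(live_url: str) -> str:
    """Return the last non-empty path segment of a live-room URL.

    Hypothetical helper: assumes the uid is the final path segment,
    as in the example URL given in the usage notes.
    """
    path = urlparse(live_url).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segment found in {live_url!r}")
    return segments[-1]


if __name__ == "__main__":
    print(extract_streamer_uid("https://live.kuaishou.com/u/yingjia2019"))
```

Parsing with `urlparse` rather than splitting the raw string means a trailing slash or a query string does not end up in the uid.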

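Your cookie:

1. Open the browser developer tools: right-click and choose Inspect, or press F12.

2. Click the Network tab.

3. Refresh the page (you can press F5).

4. Find the HTML document whose name matches the streamer uid, then click Headers on the right.

5. Scroll to the bottom and find the cookie line. Copy the did=web_xxxxxxxxxxxxxx; entry from it.

6. The cookie value to enter in the tool is web_xxxxxxxxxxxxxx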
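
If you copy the entire cookie line instead of picking out the entry by hand, the `did` value can be extracted as sketched below. This is an assumption-laden illustration, not code from the tool: `extract_did` is a hypothetical helper, and the other cookie names in the sample string are made up.

```python
def extract_did(cookie_header: str) -> str:
    """Return the value of the `did` entry from a raw Cookie header string.

    Hypothetical helper: assumes the usual "name=value; name=value" layout.
    """
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name == "did":
            return value
    raise KeyError("no 'did' entry found in the cookie string")


if __name__ == "__main__":
    # Sample string for illustration only; 'foo' and 'bar' are invented names.
    sample = "foo=1; did=web_xxxxxxxxxxxxxx; bar=2"
    print(extract_did(sample))
```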

Room id:

1. Click the Elements tab in the developer tools, press Ctrl+F to open the search box, and enter: live-stream-id

2. Copy live-stream-id="Zo9Upaz8w90"

3. The room id to enter is Zo9Upaz8w90
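
The same lookup can be done on saved page source with a regular expression. The sketch below assumes the attribute appears in the HTML exactly as `live-stream-id="..."`, as in the steps above; `extract_room_id` is a hypothetical helper and the sample markup is invented for illustration.

```python
import re

# Matches live-stream-id="<value>" and captures the value between the quotes.
_ROOM_ID_PATTERN = re.compile(r'live-stream-id="([^"]+)"')


def extract_room_id(page_html: str) -> str:
    """Return the first live-stream-id attribute value found in the HTML."""
    match = _ROOM_ID_PATTERN.search(page_html)
    if match is None:
        raise ValueError("live-stream-id attribute not found")
    return match.group(1)


if __name__ == "__main__":
    # Invented markup; only the attribute name and value come from the steps above.
    sample_html = '<div class="player" live-stream-id="Zo9Upaz8w90"></div>'
    print(extract_room_id(sample_html))
```

If the attribute is missing, the streamer may no longer be live, which is worth checking before starting the tool.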

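Keep the page open while the tool is running; some time after the page is closed, the cookie expires.

This tool is intended for learning. Do not abuse it.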
Source code(tar.gz)
Source code(zip)
default.rar(21.47 MB)
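Novel downloader (Feb 2, 2021)
Introduction

1. Novel download (advantage: fast, because complete txt files are fetched directly from the web)
2. Online novel crawling (advantage: broad coverage; almost any published novel can be found)

Special notice:

This script is for testing, learning, and research only. Commercial use is prohibited. Its legality, accuracy, completeness, and effectiveness are not guaranteed; use your own judgment.

No public account or self-media outlet may repost or publish any resource file from this project in any form.

No responsibility is accepted for any script problem in this project, including but not limited to any loss or damage caused by a script error.

Do not use any part of this project for commercial or illegal purposes; otherwise you bear the consequences.

This project is licensed under GPL-3.0. Where this special notice conflicts with the GPL-3.0 License, this special notice prevails.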

Source code(tar.gz)
Source code(zip)
default.zip(44.16 MB)

Owner

lx

Every noble work is at first impossible.

