Download NCERT books using Scrapy

Overview

download_ncert_books

Download NCERT books using Scrapy

NCERT_CLASS_1 NCERT_CLASS_2 NCERT_CLASS_3 NCERT_CLASS_4 NCERT_CLASS_5 NCERT_CLASS_6 NCERT_CLASS_7 NCERT_CLASS_8 NCERT_CLASS_9 NCERT_CLASS_10 NCERT_CLASS_11 NCERT_CLASS_12

Downloading Books:

You can either use the spider by cloning this repo and following the instructions given below,
or
you can download the books directly from the release section or by clicking on the badges above.

There are two different kinds of zips in the release section for every class:

  1. Book-wise zips, NCERT_CLASS_ClassNo_Subject_BookName.zip: each contains the chapters of the book BookName for the subject Subject of class ClassNo
  2. Book-text zips, Class_ClassNo_Text.zip: each contains the text extracted from all the books of class ClassNo
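
For example, for the Class 11 Economics book used in the walkthrough below, the archive names would look roughly like this (illustrative, inferred from the patterns above; check the release section for the exact names):

NCERT_CLASS_11_Economics_Indian_Economic_Development.zip
Class_11_Text.zip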

How to use the spider

Initial Setup

git clone https://github.com/nit-in/download_ncert_books.git
cd download_ncert_books
pip install -r requirements.txt
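
If you prefer to keep the dependencies isolated, the usual Python virtual-environment workflow works here too (optional; standard Python tooling, not specific to this repo):

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt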

To run the spider:

scrapy crawl --nolog ncert

and follow the prompts

For example, if you want to download the Class 11 Economics book:

scrapy crawl --nolog ncert

Enter the class:        11

Select one of the subjects:
Enter 1 for Sanskrit
Enter 2 for Accountancy
Enter 3 for Chemistry
Enter 4 for Mathematics
Enter 5 for Economics
Enter 6 for Psychology
Enter 7 for Geography

and so on ...

Enter subject number:   5

Select one of the books:
Enter 1 for Indian Economic Development
Enter 2 for Statistics for Economics
Enter 3 for Sankhyiki
Enter 4 for Bhartiya Airthryavstha Ka Vikas 
Enter 5 for Hindustan Ki Moaashi Tarraqqi(Urdu)
Enter 6 for Shumariyaat Bar-e-Mushiyat(Urdu)

Enter book number:      1

Downloading...  Class: Class11  Subject: Economics      Book: Indian_Economic_Development       Chapters: 10


downloading keec1ps.pdf to  /home/user/ncert/Class11/Economics/Indian_Economic_Development/keec1ps.pdf
downloading keec101.pdf to  /home/user/ncert/Class11/Economics/Indian_Economic_Development/keec101.pdf
downloading keec102.pdf to  /home/user/ncert/Class11/Economics/Indian_Economic_Development/keec102.pdf
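
As the output above shows, the PDFs are saved under a per-book directory tree, here /home/user/ncert/Class11/Economics/Indian_Economic_Development/. Assuming that default layout, a quick listing verifies the download:

ls ~/ncert/Class11/Economics/Indian_Economic_Development/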

OR

To download multiple books, enter their numbers separated by commas, e.g.:

Select one of the books:
Enter 1 for Indian Economic Development
Enter 2 for Statistics for Economics
Enter 3 for Sankhyiki
Enter 4 for Bhartiya Airthryavstha Ka Vikas 
Enter 5 for Hindustan Ki Moaashi Tarraqqi(Urdu)
Enter 6 for Shumariyaat Bar-e-Mushiyat(Urdu)

Enter book number:      1,2
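
A reply like 1,2 simply selects several entries from the printed menu. As a rough sketch of the idea (an assumption for illustration, not necessarily the spider's actual code), such a reply can be parsed in Python like this:

# Hypothetical sketch: turn a reply such as "1,2" into menu selections.
reply = "1,2"  # what the user typed at "Enter book number:"
selected = [int(part) for part in reply.split(",") if part.strip()]
print(selected)  # [1, 2] -> Indian Economic Development, Statistics for Economics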

If you want to see the Scrapy spider log, run the spider without the --nolog flag:

scrapy crawl ncert
You might also like...
Snowflake database loading utility with Scrapy integration

Snowflake Stage Exporter Snowflake database loading utility with Scrapy integration. Meant for streaming ingestion of JSON serializable objects into S

Scraping news from Ucsal portal with Scrapy.

NewsScraping This is a project for scraping the latest news, from 2021, from the Ucsal university portal http://noosfero.ucsal.br/institucional Tecno

a Scrapy spider that utilizes Postgres as a DB, Squid as a proxy server, Redis for de-duplication and Splash to render JavaScript. All in a microservices architecture utilizing Docker and Docker Compose

This is George's Scraping Project To get started cd into the theZoo file and run: chmod +x script.sh then: ./script.sh This will spin up a Postgres co

Fundamentus scrapy

Fundamentus_scrapy Downloads information that the other Fundamentus scrapers do not. To start (python main.py); a file will be created named

Crawler for the Fundamentus.com site using the Scrapy framework, covering both the detailed tab and the summary tab.

Crawler for the Fundamentus.com site using the Scrapy framework, covering both the detailed tab and the summary tab. (All the information)

Scrapy-based cyber security news finder

Cyber-Security-News-Scraper Scrapy-based cyber security news finder Goal To keep up to date on the constant barrage of information within the field of

Scrapy uses Request and Response objects for crawling web sites.

Requests and Responses Scrapy uses Request and Response objects for crawling web sites. Typically, Request objects are generated in the spiders and p

Bigdata - This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster

Scrapy Cluster This Scrapy project uses Redis and Kafka to create a distributed

Iptvcrawl - A scrapy project for crawl IPTV playlist

iptvcrawl a scrapy project for crawl IPTV playlist. Dependency Python3 pip insta

Comments
  • Bump requests from 2.26.0 to 2.28.1


    Bumps requests from 2.26.0 to 2.28.1.

    Release notes

    Sourced from requests's releases.

    v2.28.1

    2.28.1 (2022-06-29)

    Improvements

    • Speed optimization in iter_content with transition to yield from. (#6170)

    Dependencies

    • Added support for chardet 5.0.0 (#6179)
    • Added support for charset-normalizer 2.1.0 (#6169)

    New Contributors

    Full Changelog: https://github.com/psf/requests/blob/main/HISTORY.md#2281-2022-06-29

    v2.28.0

    2.28.0 (2022-06-09)

    Deprecations

    • ⚠️ Requests has officially dropped support for Python 2.7. ⚠️ (#6091)
    • Requests has officially dropped support for Python 3.6 (including pypy3). (#6091)

    Improvements

    • Wrap JSON parsing issues in Request's JSONDecodeError for payloads without an encoding to make json() API consistent. (#6097)
    • Parse header components consistently, raising an InvalidHeader error in all invalid cases. (#6154)
    • Added provisional 3.11 support with current beta build. (#6155)
    • Requests got a makeover and we decided to paint it black. (#6095)

    Bugfixes

    • Fixed bug where setting CURL_CA_BUNDLE to an empty string would disable cert verification. All Requests 2.x versions before 2.28.0 are affected. (#6074)
    • Fixed urllib3 exception leak, wrapping urllib3.exceptions.SSLError with requests.exceptions.SSLError for content and iter_content. (#6057)
    • Fixed issue where invalid Windows registry entries caused proxy resolution to raise an exception rather than ignoring the entry. (#6149)
    • Fixed issue where entire payload could be included in the error message for JSONDecodeError. (#6079)

    New Contributors

    ... (truncated)

    Changelog

    Sourced from requests's changelog.

    2.28.1 (2022-06-29)

    Improvements

    • Speed optimization in iter_content with transition to yield from. (#6170)

    Dependencies

    • Added support for chardet 5.0.0 (#6179)
    • Added support for charset-normalizer 2.1.0 (#6169)

    2.28.0 (2022-06-09)

    Deprecations

    • ⚠️ Requests has officially dropped support for Python 2.7. ⚠️ (#6091)
    • Requests has officially dropped support for Python 3.6 (including pypy3.6). (#6091)

    Improvements

    • Wrap JSON parsing issues in Request's JSONDecodeError for payloads without an encoding to make json() API consistent. (#6097)
    • Parse header components consistently, raising an InvalidHeader error in all invalid cases. (#6154)
    • Added provisional 3.11 support with current beta build. (#6155)
    • Requests got a makeover and we decided to paint it black. (#6095)

    Bugfixes

    • Fixed bug where setting CURL_CA_BUNDLE to an empty string would disable cert verification. All Requests 2.x versions before 2.28.0 are affected. (#6074)
    • Fixed urllib3 exception leak, wrapping urllib3.exceptions.SSLError with requests.exceptions.SSLError for content and iter_content. (#6057)
    • Fixed issue where invalid Windows registry entries caused proxy resolution to raise an exception rather than ignoring the entry. (#6149)
    • Fixed issue where entire payload could be included in the error message for JSONDecodeError. (#6036)

    2.27.1 (2022-01-05)

    Bugfixes

    • Fixed parsing issue that resulted in the auth component being dropped from proxy URLs. (#6028)

    2.27.0 (2022-01-03)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 0
  • Bump itemadapter from 0.4.0 to 0.7.0


    Bumps itemadapter from 0.4.0 to 0.7.0.

    Release notes

    Sourced from itemadapter's releases.

    v0.7.0

    What's Changed

    New Contributors

    Full Changelog: https://github.com/scrapy/itemadapter/compare/v0.6.0...v0.7.0

    v0.6.0

    What's Changed

    Full Changelog: https://github.com/scrapy/itemadapter/compare/v0.5.0...v0.6.0

    v0.5.0

    What's Changed

    Full Changelog: https://github.com/scrapy/itemadapter/compare/v0.4.0...v0.5.0

    Changelog

    Sourced from itemadapter's changelog.

    0.7.0 (2022-08-02)

    ItemAdapter.get_field_names_from_class (#64)

    0.6.0 (2022-05-12)

    Slight performance improvement (#62)

    0.5.0 (2022-03-18)

    Improve performance by removing imports inside functions (#60)

    Commits
    • 0bd037c Bump version: 0.6.0 → 0.7.0
    • 8f3826a Update changelog for 0.7.0
    • 900ae14 ItemAdapter.get_field_names_from_class (#64)
    • 927ee25 Bump version: 0.5.0 → 0.6.0
    • 86f82ea Update changelog for 0.6.0
    • 8f239bc Merge pull request #62 from scrapy/performance
    • 60c9ccc Merge pull request #61 from scrapy/fix-repr
    • 8733014 Replace 'any' ocurrences
    • d66aa62 Remove hardcoded class name in ItemAdapter.repr
    • 1203b5e Bump version: 0.4.0 → 0.5.0
    • Additional commits viewable in compare view

    Dependabot compatibility score, commands and options: same as listed for the first PR above.
    dependencies 
    opened by dependabot[bot] 0
  • Bump scrapy from 2.5.0 to 2.7.1


    Bumps scrapy from 2.5.0 to 2.7.1.

    Release notes

    Sourced from scrapy's releases.

    2.7.1

    • Relaxed the restriction introduced in 2.6.2 so that the Proxy-Authentication header can again be set explicitly in certain cases, restoring compatibility with scrapy-zyte-smartproxy 2.1.0 and older
    • Bug fixes

    See the full changelog

    2.7.0

    See the full changelog

    2.6.3

    Makes pip install Scrapy work again.

    It required making changes to support pyOpenSSL 22.1.0. We had to drop support for SSLv3 as a result.

    We also upgraded the minimum versions of some dependencies.

    See the changelog.

    2.6.2

    Fixes a security issue around HTTP proxy usage, and addresses a few regressions introduced in Scrapy 2.6.0.

    See the changelog.

    2.6.1

    Fixes a regression introduced in 2.6.0 that would unset the request method when following redirects.

    2.6.0

    • Security fixes for cookie handling (see details below)
    • Python 3.10 support
    • asyncio support is no longer considered experimental, and works out-of-the-box on Windows regardless of your Python version
    • Feed exports now support pathlib.Path output paths and per-feed item filtering and post-processing

    See the full changelog

    Security bug fixes

    • When a Request object with cookies defined gets a redirect response causing a new Request object to be scheduled, the cookies defined in the original Request object are no longer copied into the new Request object.

      If you manually set the Cookie header on a Request object and the domain name of the redirect URL is not an exact match for the domain of the URL of the original Request object, your Cookie header is now dropped from the new Request object.

      The old behavior could be exploited by an attacker to gain access to your cookies. Please, see the cjvr-mfj7-j4j8 security advisory for more information.

    ... (truncated)

    Changelog

    Sourced from scrapy's changelog.

    Scrapy 2.7.1 (2022-11-02)

    New features

    • Relaxed the restriction introduced in 2.6.2 so that the Proxy-Authentication header can again be set explicitly, as long as the proxy URL in the proxy metadata has no other credentials, and for as long as that proxy URL remains the same; this restores compatibility with scrapy-zyte-smartproxy 2.1.0 and older (#5626)

    Bug fixes

    • Using -O/--overwrite-output and -t/--output-format options together now produces an error instead of ignoring the former option (#5516, #5605)
    • Replaced deprecated asyncio APIs that implicitly use the current event loop with code that explicitly requests a loop from the event loop policy (#5685, #5689)
    • Fixed uses of deprecated Scrapy APIs in Scrapy itself (#5588, #5589)
    • Fixed uses of a deprecated Pillow API (#5684, #5692)
    • Improved code that checks if generators return values, so that it no longer fails on decorated methods and partial methods (#5323, #5592, #5599, #5691)

    Documentation

    • Upgraded the Code of Conduct to Contributor Covenant v2.1 (#5698)
    • Fixed typos (#5681, #5694)

    Quality assurance

    • Re-enabled some erroneously disabled flake8 checks (#5688)
    • Ignored harmless deprecation warnings from typing in tests (#5686, #5697)
    • Modernized our CI configuration (#5695, #5696)

    ... (truncated)

    Commits

    • 6ded3cf Bump version: 2.7.0 → 2.7.1
    • 95880c5 Merge pull request #5701 from scrapy/relnotes-2.7.1
    • 5ec175b Small relnotes fixes.
    • 940a738 Release notes for 2.7.1.
    • a95a338 Merge pull request #5599 from tonal/patch-1
    • 9077d0f Merge pull request #5698 from pankali/patch-1
    • 76c2cb0 Merge pull request #5697 from iamkaushal/#5686_fix
    • 9f45be4 Update Code of Conduct to Contributor Covenant v2.1
    • bd9e482 added typing.io and typing.re in pytest warning filter to ignore
    • fd692f3 Prevent running the -O and -t command-line options together (#5605)
    • Additional commits viewable in the compare view: https://github.com/scrapy/scrapy/compare/2.5.0...2.7.1


    Dependabot compatibility score, commands and options: same as listed for the first PR above, plus the following:
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Releases (class_9)
Owner
Coding is a hobby; not professionally educated in programming. If you find issues or mistakes, DO tell me ;-)
Scraping web pages to get data

Scraping Data Get public data and save it in a database. This project uses Python. How to run the project: 1 - Clone the repository 2 - Install beautifulsoup4

Soccer Project 2 Nov 01, 2021
This is python to scrape overview and reviews of companies from Glassdoor.

Data Scraping for Glassdoor This is python to scrape overview and reviews of companies from Glassdoor. Please use it carefully and follow the Terms of

Houping 5 Jun 23, 2022
API to parse tibia.com content into python objects.

Tibia.py An API to parse Tibia.com content into object oriented data. No fetching is done by this module, you must provide the html content. Features:

Allan Galarza 25 Oct 31, 2022
This Spider/Bot is developed using Python and based on Scrapy Framework to Fetch some items information from Amazon

- Hello, This Project Contains Amazon Web-bot. - I've developed this bot for fetching some items' information on Amazon. - Scrapy Framework in Python is

Khaled Tofailieh 4 Feb 13, 2022
Snowflake database loading utility with Scrapy integration

Snowflake Stage Exporter Snowflake database loading utility with Scrapy integration. Meant for streaming ingestion of JSON serializable objects into S

Oleg T. 0 Dec 06, 2021
A Pixiv web crawler module

Pixiv-spider A Pixiv spider module WARNING It's an unfinished work; browse the code carefully before using it. Features 0004 - Readme.md updated, co

Uzuki 1 Nov 14, 2021
Scrapegoat is a Python library that can be used to scrape websites from the internet based on the relevance of a given topic, irrespective of language, using Natural Language Processing

Scrapegoat is a Python library that can be used to scrape websites from the internet based on the relevance of a given topic, irrespective of language, using Natural Language Processing. It can be ma

10 Jul 06, 2022
Lovely Scrapper

Lovely Scrapper

Tushar Gadhe 2 Jan 01, 2022
Meme-videos - Scrapes memes and turn them into a video compilations

Meme Videos Scrapes memes from reddit using praw and request and then converts t

Partho 12 Oct 28, 2022
Scrap-mtg-top-8 - A top 8 mtg scraper using python

Scrap-mtg-top-8 - A top 8 mtg scraper using python

1 Jan 24, 2022
Scraping script for stats on covid19 pandemic status in Chiba prefecture, Japan

About: A script that converts Chiba Prefecture's detailed per-region infection statistics (an Excel file) to CSV and outputs daily per-region infection tallies. Requirements: a POSIX-compatible shell, e.g. GNU Bash (1), curl (1), python >= 3.8, pandas >= 1.1.

Conv4Japan 1 Nov 29, 2021
feapder is a simple, fast, lightweight spider framework, built for rapid development, fast crawling, ease of use and powerful features. It supports distributed spiders, batch spiders, multi-template spiders, and a complete spider alerting mechanism.

feapder is a simple, fast, lightweight spider framework. The name comes from the initials of fast, easy, air, pro, spider. Built for rapid development, fast crawling, ease of use and powerful features, and refined over four years, it supports lightweight spiders, distributed spiders, batch spiders, spider integration, and a complete spider alerting mechanism.

boris 1.4k Dec 29, 2022
Docker containerized Python Flask API that uses selenium to scrape and interact with websites

Docker containerized Python Flask API that uses selenium to scrape and interact with websites

Christian Gracia 0 Jan 22, 2022
Twitter Eye is a Twitter Information Gathering Tool With Twitter Eye

Twitter Eye is a Twitter Information Gathering Tool With Twitter Eye, you can search with various keywords and usernames on Twitter.

Jolanda de Koff 19 Dec 12, 2022
WebScrapping Project - G1 Latest News

Web Scraping with Python This project consists of code that lets the user fetch the latest news about any given term from the G1 site. For this p

Eduardo Henrique 2 Feb 13, 2022
This is a Python script; it works by using selenium to locate the relevant elements and then completes the browser automation with click events.

This is a Python script; it works by using selenium to locate the relevant elements and then completes the browser automation with click events.

N0el4kLs 5 Nov 19, 2021
A repository with scraping code and soccer dataset from understat.com.

UNDERSTAT - SHOTS DATASET As many people interested in soccer analytics know, Understat is an amazing source of information. They provide Expected Goa

douglasbc 48 Jan 03, 2023
Python Web Scrapper Project

Web Scrapper A project developed in Python, mainly with Selenium, BeautifulSoup and Pandas; it is a web scraper that pulls a table with the main

Jordan Ítalo Amaral 2 Jan 04, 2022
A training task for web scraping using python multithreading and a real-time-updated list of available proxy servers.

Parallel web scraping The project is a training task for web scraping using python multithreading and a real-time-updated list of available proxy serv

Kushal Shingote 1 Feb 10, 2022
Python framework to scrape Pastebin pastes and analyze them

pastepwn - Paste-Scraping Python Framework Pastebin is a very helpful tool to store or rather share ascii encoded data online. In the world of OSINT,

Rico 105 Dec 29, 2022