Audio media crawler for lbry.

Last update: Dec 03, 2022

Overview

Audio media crawler for lbry.

Requirements

Python 3.8
Poetry 1.1.7
Elasticsearch 7.14.0
Lbry-sdk 0.99.0
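
The versions above can be checked from a script before installing. The following is only a sketch: it assumes the listed versions are minimums, which this README does not state, and the helper names `parse_version` and `meets_minimum` are hypothetical and not part of the project.

```python
import sys


def parse_version(text):
    """Turn a dotted version such as "7.14.0" into a tuple of ints.

    Assumption: purely numeric versions; pre-release suffixes are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so "3.8" compares equal to "3.8.0".
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    current = ".".join(str(n) for n in sys.version_info[:3])
    print("Python", current, "ok:", meets_minimum(current, "3.8"))
```

The installed versions of Poetry, Elasticsearch, and lbry-sdk would have to be obtained separately and passed to `meets_minimum` as strings.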

Development

This project uses Poetry for dependency management.

Install dependencies

Install all dependencies defined by the project. For more information, see the Poetry documentation.

poetry install

Tasks

Update hooks

Set up and update the pre-commit hooks. Run this once after the first poetry install.

poetry run task update-hooks

Format code

Format the code with black. For more information, see the black documentation.

poetry run task format

Commands

Basic usage

Commands are run through Poetry. For more information, see the Poetry documentation.

poetry run podcatcher <command>

Sync

Scan all audio streams to find music and podcast episodes, keeping Elasticsearch in sync.

poetry run podcatcher sync

Retry sync

Retry a failed sync from the last checkpoint. If no previous sync failed, this runs a normal sync.

poetry run podcatcher retry-sync
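
The checkpoint behaviour described above can be illustrated with a generic resume loop. This is a minimal sketch and not the project's implementation: the checkpoint file format, the `next_index` field, and the function name `sync_with_checkpoint` are all assumptions made for the example.

```python
import json
from pathlib import Path


def sync_with_checkpoint(items, process, checkpoint_path):
    """Process items in order, resuming from a saved checkpoint if one exists.

    Hypothetical sketch: the checkpoint is a JSON file holding the index of
    the first item that still needs processing.
    """
    checkpoint_path = Path(checkpoint_path)
    start = 0
    if checkpoint_path.exists():
        # A previous run failed: resume where it stopped.
        start = json.loads(checkpoint_path.read_text())["next_index"]
    # With no checkpoint, start is 0 and this is a normal full run.
    for index in range(start, len(items)):
        try:
            process(items[index])
        except Exception:
            # Record the failing position so a retry can pick it up.
            checkpoint_path.write_text(json.dumps({"next_index": index}))
            raise
    # Everything succeeded: remove any stale checkpoint.
    if checkpoint_path.exists():
        checkpoint_path.unlink()
```

Calling the function again after a failure reprocesses the item that failed and everything after it, while items that already succeeded are skipped.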

Cache sync

Skip the scan and sync existing cache data to Elasticsearch.

poetry run podcatcher cache-sync

Clear cache

Remove all files in the cache directory.

poetry run podcatcher clear-cache

Drop

Remove all indices from Elasticsearch and all files from the cache directory.

poetry run podcatcher drop
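
The commands in this section can also be invoked from a script, for example from a scheduled job. The sketch below only wraps the command lines shown in this README. The helper names `build_command` and `run_command` are hypothetical, and it assumes `poetry` is on the PATH and the script runs from the project directory.

```python
import subprocess

# Subcommands documented in this README.
COMMANDS = ("sync", "retry-sync", "cache-sync", "clear-cache", "drop")


def build_command(name):
    """Return the argument list for `poetry run podcatcher <name>`."""
    if name not in COMMANDS:
        raise ValueError(f"unknown podcatcher command: {name!r}")
    return ["poetry", "run", "podcatcher", name]


def run_command(name):
    """Run the command and return its exit code (0 normally means success)."""
    return subprocess.run(build_command(name), check=False).returncode


if __name__ == "__main__":
    # Print the command line instead of running it.
    print(" ".join(build_command("sync")))
```

Validating the name before calling `subprocess.run` keeps a typo from reaching the shell, which matters most for the destructive `drop` and `clear-cache` commands.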


Owner

Hound.fm
