Automatically detect changes made to the official Telegram sites.

Overview

🕷 Telegram Web Crawler

This project automatically detects changes made to the official Telegram sites. This helps anticipate upcoming updates and other news (new vacancies, API updates, etc.).

Name                 | Commits | Status
Site updates tracker | Commits | Fetch new content of tracked links to files
Site links tracker   | Commits | Generate or update list of tracked links
  • passing – new changes
  • failing – no changes

Subscribe to the channel with alerts to stay updated. A copy of the Telegram websites is stored here.

Image: GitHub pretty diff

How it works

  1. Link crawling runs as often as possible. It starts from the home page of the site, detects relative and absolute sublinks, and recursively repeats the operation. It writes a list of unique links for later content comparison. Links can also be added by hand to help the script find hidden pages (pages that no one links to). A system of rules for the link crawler handles exceptions.

  2. Content crawling runs as often as possible and uses the list of links collected in step 1. Going through the list, it fetches the content of each link and builds a tree of subfolders and files. It removes all dynamic content from the files.

  3. GitHub Actions. The tracker works without its own servers. You can fork this repository and run your own tracker. Workflows launch the scripts and commit changes. All file changes are tracked by Git and displayed nicely on GitHub. A workflow should succeed only if there are changes on the Telegram website; otherwise it should fail. If the build succeeds, notifications can be sent to a Telegram channel, and so on.
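Step 2 above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `data` output folder, and the two "dynamic content" patterns (a per-request `hash` query value and a numeric cache-busting suffix on asset URLs) are all assumptions.

```python
import re
from pathlib import Path

# Hypothetical patterns for content that changes on every request.
# The real project may strip different things.
DYNAMIC_PATTERNS = [
    (re.compile(r'\?hash=[0-9a-f]+'), ''),            # per-request hash
    (re.compile(r'(\.(?:css|js|png))\?\d+'), r'\1'),  # cache-busting suffix
]


def url_to_path(url, root='data'):
    """Map 'telegram.org/tos' to '<root>/telegram.org/tos.html'."""
    host, _, rest = url.strip('/').partition('/')
    return Path(root, host, (rest or 'index') + '.html')


def strip_dynamic(html):
    """Remove fragments that would produce a diff on every run."""
    for pattern, replacement in DYNAMIC_PATTERNS:
        html = pattern.sub(replacement, html)
    return html


def save_page(url, html, root='data'):
    """Write the cleaned page; return True only if the stored file changed."""
    path = url_to_path(url, root)
    cleaned = strip_dynamic(html)
    if path.exists() and path.read_text(encoding='utf-8') == cleaned:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(cleaned, encoding='utf-8')
    return True
```

With files laid out like this, a workflow can commit the folder and let Git decide whether anything changed, which matches the "succeed only if there are changes" behavior described in step 3.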

FAQ

Q: How often is "as often as possible"?

A: TL;DR: the content update action runs every ~10 minutes. More info:

Q: Why are there two separate crawl scripts instead of one?

A: Because the original idea was to update the tracked links once an hour, so it was convenient to use separate scripts and workflows. After the Telegram 7.7 update, I realized that finding new blog posts that slowly is a bad idea.

Q: Why does the script for sending alerts have a while loop?

A: Because the GitHub API doesn't return information about a commit immediately after a push to the repository. Therefore, the script waits for the information to appear...
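The waiting loop might look like the sketch below. It is not the code from make_and_send_alert.py: `wait_for_commit`, the `fetch_commit` callable, and the attempt and delay values are hypothetical. The sketch only shows polling until the data appears, with an upper bound so the loop cannot spin forever.

```python
import time


def wait_for_commit(fetch_commit, sha, attempts=10, delay=1.0, sleep=time.sleep):
    """Poll until fetch_commit(sha) returns data or the attempts run out.

    fetch_commit is a hypothetical callable that returns None while the
    commit is not yet visible through the API.
    """
    tries = 0
    while tries < attempts:
        commit = fetch_commit(sha)
        if commit is not None:
            return commit
        tries += 1
        sleep(delay)
    raise TimeoutError(f'commit {sha} did not appear after {attempts} attempts')
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.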

Q: Why are you using a GitHub Personal Access Token in the actions/checkout workflow step?

A: To be able to trigger other workflows through the on push trigger. More info:

Q: Why are you using a GitHub PAT in make_and_send_alert.py?

A: To increase the GitHub API rate limits.
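As an illustration of the last answer, authenticated requests to the GitHub REST API carry the token in an `Authorization` header, and GitHub applies a higher rate limit to them than to anonymous requests. The helper name and the idea of passing the token as an argument are assumptions; the project's script may build its requests differently.

```python
def github_headers(token=None):
    """Build request headers; add the PAT only when one is provided."""
    headers = {'Accept': 'application/vnd.github+json'}
    if token:
        # Authenticated requests get the higher rate limit.
        headers['Authorization'] = f'Bearer {token}'
    return headers
```

The token itself would normally come from a repository secret rather than from the source code.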

TODO list

  • add storing the history of content using hashes;
  • add storing hashes of images, SVGs, and videos.

Example of link crawler rules configuration

CRAWL_RULES = {
    # every rule is a regex
    # an empty string matches any url
    # allow rules have higher priority than deny rules
    'translations.telegram.org': {
        'allow': {
            r'^[^/]*$',  # root
            r'org/[^/]*/$',  # 1 lvl sub
            r'/en/[a-z_]+/$'  # 1 lvl after /en/
        },
        'deny': {
            '',  # all
        }
    },
    'bugs.telegram.org': {
        'deny': {
            '',    # deny all sub domain
        },
    },
}
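One way such rules could be evaluated is sketched below. The function `should_crawl` is hypothetical. The sketch also assumes that URLs are stored without a scheme (as in the hidden URLs list), that patterns are applied with `re.search`, and that hosts without an entry are crawled by default. None of this is confirmed by the project.

```python
import re

CRAWL_RULES = {
    'translations.telegram.org': {
        'allow': {r'^[^/]*$', r'org/[^/]*/$', r'/en/[a-z_]+/$'},
        'deny': {''},
    },
    'bugs.telegram.org': {
        'deny': {''},
    },
}


def should_crawl(url, rules=CRAWL_RULES):
    """Return True if url passes the rules for its host."""
    host = url.split('/', 1)[0]
    host_rules = rules.get(host)
    if host_rules is None:
        return True  # assumption: hosts without rules are not restricted
    # Allow rules are checked first because they outrank deny rules.
    if any(re.search(p, url) for p in host_rules.get('allow', ())):
        return True
    # An empty pattern matches every URL, so '' denies the whole host.
    if any(re.search(p, url) for p in host_rules.get('deny', ())):
        return False
    return True
```

Under these assumptions, `translations.telegram.org/en/android/` is allowed, while a deeper page such as `translations.telegram.org/en/android/login/` falls through to the catch-all deny rule.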

Current hidden URLs list

HIDDEN_URLS = {
    # 'corefork.telegram.org', # disabled

    'telegram.org/privacy/gmailbot',
    'telegram.org/tos',
    'telegram.org/tour',
    'telegram.org/evolution',

    'desktop.telegram.org/changelog',
}
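To show why the hidden list matters, here is a rough sketch of a crawl seeded with both the home page and hand-added URLs. It is an assumption-laden illustration, not the project's crawler: `LinkParser`, `normalize`, `crawl`, and the `fetch` callable are hypothetical names, and an explicit queue stands in for recursion.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkParser(HTMLParser):
    """Collect href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:
                self.hrefs.append(href)


def normalize(base_url, href):
    """Resolve href against base_url; return 'host/path' or None if off-site."""
    parts = urlsplit(urljoin('https://' + base_url, href))
    if parts.scheme not in ('http', 'https'):
        return None
    host = parts.netloc
    if host != 'telegram.org' and not host.endswith('.telegram.org'):
        return None
    return (host + parts.path).rstrip('/')


def crawl(seeds, fetch):
    """Visit the seeds and every link reachable from them; return unique URLs.

    fetch(url) is a hypothetical callable returning HTML text or None.
    """
    seen, queue = set(), list(seeds)
    while queue:
        url = queue.pop()
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            link = normalize(url, href)
            if link and link not in seen:
                queue.append(link)
    return seen
```

Calling `crawl({'telegram.org'} | HIDDEN_URLS, fetch)` would start from the home page and from each hidden URL, so pages that nothing links to are still visited along with whatever they link to.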

License

Licensed under the MIT License.

Owner
Il'ya
Telegram: https://t.me/MarshalX