
Overview

WebAttestation


Introduction

We want to check several batches of web URLs (1 to 100 K) and find the phishing websites/URLs among them. This module performs the URL/web attestation using the API from the NUS-Phishpedia project. The program contains 3 main parts: WebDownloader, WebScreenShoter and PhishperidaPKG.

WebDownloader

This module provides an API to download the components of a webpage (HTML files, image files, JavaScript files and href link files) for a given input URL.

Module detail doc : https://github.com/LiuYuancheng/WebAttestation/blob/main/WebDownloadReadme.md

WebScreenShoter

This module uses different web browsers' drivers to capture a screenshot of the webpage at the given URL.

Module detail doc :

PhishperidaPKG

This module encapsulates the NUS-Phishpedia project (which is not object-oriented) as a black-box API for other projects to use.

NUS-Phishpedia project: https://github.com/lindsey98/Phishpedia

Module detail doc :

For each URL, the program performs the steps below:

  1. Use the WebDownloader module to download all the web components.

  2. Use the WebScreenShoter module to get a screenshot of the webpage at the URL.

  3. Pass the web components and the screenshot to PhishperidaPKG for the Siamese-model check.
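
The three steps above can be sketched as a small per-URL pipeline. This is only an illustrative outline, not the project's actual code: `attest_url`, `AttestationResult` and the three callables are hypothetical names, and the real module APIs are described in the module detail docs. The callables are passed in so the sketch runs without a network connection, browser driver or GPU.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AttestationResult:
    url: str
    component_dir: str    # where the downloaded web components were saved
    screenshot_path: str  # where the screenshot image was saved
    is_phishing: bool     # verdict returned by the checker


def attest_url(
    url: str,
    download: Callable[[str], str],
    screenshot: Callable[[str], str],
    check: Callable[[str, str], bool],
) -> AttestationResult:
    """Run the three steps for one URL.

    The callables are hypothetical stand-ins for WebDownloader,
    WebScreenShoter and PhishperidaPKG; their real signatures may differ.
    """
    component_dir = download(url)                         # step 1
    screenshot_path = screenshot(url)                     # step 2
    is_phishing = check(component_dir, screenshot_path)   # step 3
    return AttestationResult(url, component_dir, screenshot_path, is_phishing)


if __name__ == "__main__":
    # Dummy stand-ins so the sketch can be run as-is.
    result = attest_url(
        "http://example.com",
        download=lambda u: "out/example.com",
        screenshot=lambda u: "out/example.com/shot.png",
        check=lambda comp_dir, shot: False,
    )
    print(result)
```

Keeping the steps behind plain callables means each module can be swapped or mocked independently, which matters here because the screenshot and Phishpedia steps have their own hardware requirements.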

Program Workflow

When the program is set to run in a single thread, its workflow is as shown in the diagram below:


Program Setup

Development Environment: Python 3.7.10
Additional Lib/Software Needed
  • WebDownloader: Refer to the program setup section in [WebDownloaderReadme.md]
  • WebScreenShoter: Refer to the program setup section in [WebScreenShoterReadme.md]
  • PhishperidaPKG: Refer to the program setup section in [PhishperidaPKGReadme.md]
Hardware Needed
  • WebDownloader: N.A.
  • WebScreenShoter: Computer with video output.
  • PhishperidaPKG: Computer with an Nvidia graphics card.
Program File List

version: v0.1

| Program File | Execution Env | Description |
|---|---|---|
| src/webAttestation.py | python 3.7.4 | Main web attestation execution program. |
| src/webScreenShoter.py | python 3.7.10 | Main web screenshot execution program. |
| src/webDownload.py | python 3.7.10 | Main web downloader program API. |
| src/phishpediaPKG.py | python 3.8.10 | Encapsulated API of the NUS-Phishpedia project for OOP use. |
| src/webGlobal.py | python 3.7.4 | Global parameters file used by the other modules. |
| src/urllist.txt | | URL record list. |
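
The format of the URL record file is not specified here. As a minimal sketch, assuming one URL per line, the list could be loaded as shown below. `load_urls` is a hypothetical helper, not part of the project, and skipping blank lines, `#` comment lines and duplicates are all assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def load_urls(path: str) -> List[str]:
    """Return the URLs in a list file, in order, without duplicates.

    Assumes one URL per line; blank lines and lines starting with '#'
    are skipped (an assumption, not documented project behaviour).
    """
    urls: List[str] = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        if not url or url.startswith("#"):
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Removing duplicates up front avoids repeating the download, screenshot and check steps for the same URL in a large batch.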

Program Usage

Module API Usage
  • WebDownloader: Refer to the program API usage section in [WebDownloaderReadme.md]
  • WebScreenShoter: Refer to the program API usage section in [WebScreenShoterReadme.md]
  • PhishperidaPKG: Refer to the program API usage section in [PhishperidaPKGReadme.md]
Program Execution
  1. Copy the URLs you want to check into the URL record file "urllist.txt".

  2. cd to the program folder and run the execution command:

    python webAttestation.py
    
  3. Check the result:


Last edited by LiuYuancheng on 26/11/2021
