A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Overview

A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Python version Project version Codacy Grade


UsageDownload

Introduction

ByteCog is a python script that aims to help security researchers and others a like to classify malicious software compared to other samples, depending on what the unknown file(s) is/are being tested against. This script can be extended to use a machine learning model to classify malware if you wanted to do so. ByteCog uses multiple methods of analyzing and classifying samples given to it, such as using Shannon Entropy to give a visual aspect for the researchers to look at while analyzing the code and finding possible readable code/text in a sample. ByteCog also uses Hausdorff Distance to calculate a 'raw similarity' value based on the difference in the entropy graphs of both samples, and finally ByteCog uses Jaro-Winkler Distance to calculate the 'true similarity' since the Hausdorff Distance will in most cases return a very high value if the sample is mostly the same entropy wise, so the Jaro-Winkler Distance is used to 'adjust' the simliarity value for this case of a sample.

Requirements

  • A python installation above 3.5+, which you can download from the official python website here.

Installation

Clone this repository to your local machine by following these instructions layed out here

Then proceed to download the dependencies file by running the following line in your console window

pip install -r requirements.txt

Usage

======================================================
|      ____          __         ______               |
|     / __ ) __  __ / /_ ___   / ____/____   ____    |
|    / __  |/ / / // __// _ \ / /    / __ \ / __ \   |
|   / /_/ // /_/ // /_ /  __// /___ / /_/ // /_/ /   |
|  /_____/ \__, / \__/ \___/ \____/ \____/ \__, /    |
|         /____/                          /____/     |
|                                                    |
|                    Version: 0.4                    |
|               Author: IlluminatiFish               |
======================================================

usage: bytecog.py [-h] -k KNOWN -u UNKNOWN -i IDENTIFIER -v VISUAL

Determine whether an unknown provided sample is similar to a known sample

optional arguments:
  -h, --help            show this help message and exit
  -k KNOWN, --known KNOWN
                        The file path to the known sample
  -u UNKNOWN, --unknown UNKNOWN
                        The file path to the unknown sample
  -i IDENTIFIER, --identifier IDENTIFIER
                        The antivirus identifier of the known file
  -v VISUAL, --visual VISUAL
                        If you want to show a visual representation of the file entropy

Features & Use Cases

  • Calculates sample similarity
  • Generates chunked entropy graph
  • Able to possibly detect malicious and benign software samples

Screenshots

Chunked Entropy Graph
chunk_entropy_graph

Output of ByteCog
bytecog_output

ByteCog Log File
bytecog log file

License

ByteCog - A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance Copyright (c) 2021 IlluminatiFish

This program is free software; you can redistribute it and/or modify the code base under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but without ANY warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/

Acknowledgements

  • Using a modified version of @venkat-abhi's Shannon Entropy calculator to work with my project script, you can find the original one here.

  • Using the fastest method to get maximum key from a dictionary using this snippet here.

References

Entropy Wiki
Jaro-Winkler Distance Wiki
Hausdorff Distance Wiki
Shannon Calculator
Referenced Article #1
Referenced Paper #1
Referenced Paper #2
Referenced Paper #3

Owner
I am a developer that has a passion for programming, mathematics and cyber security. Currently Developer @South-Hollow
xray多线程批量扫描工具

Auto_xray xray多线程批量扫描工具 简介 xray社区版貌似没有批量扫描,这就让安服仔使用起来很不方便,扫站得一个个手动添加,非常难受 Auto_xray目录下记得放xray,就跟平时一样的。 选项1:oneforall+xray 输入一个主域名,自动采集子域名然后添加到xray任务列表

1frame 13 Nov 09, 2022
Herramienta para descargar eventos de Sucuri WAF hacia disco.

Descarga los eventos de Sucuri Script para descargar los eventos del Sucuri Web Application Firewall (WAF) en el disco como archivos CSV. Requerimient

CSIRT-RD 2 Nov 29, 2021
Dark-Fb No Login 100% safe

Dark-Fb No Login 100% safe TERMUX • pkg install python2 && git -y • pip2 install requests mechanize tqdm • git clone https://github.com/BOT-033/Sensei

Bukan Hamkel 1 Dec 04, 2021
Apache Flink 目录遍历漏洞批量检测 (CVE-2020-17519)

使用方法&免责声明 该脚本为Apache Flink 目录遍历漏洞批量检测 (CVE-2020-17519)。 使用方法:Python CVE-2020-17519.py urls.txt urls.txt 中每个url为一行,漏洞地址输出在vul.txt中 影响版本: Apache Flink 1

45 Sep 21, 2022
A web-app helping to create strong passwords that are easy to remember.

This is a simple Web-App that demonstrates a method of creating strong passwords that are still easy to remember. It also provides time estimates how long it would take an attacker to crack a passwor

2 Jun 04, 2021
Recon is a script to perform a full recon on a target with the main tools to search for vulnerabilities.

👑 Recon 👑 The step of recognizing a target in both Bug Bounties and Pentest can be very time-consuming. Thinking about it, I decided to create my ow

Dirso 171 Dec 31, 2022
This script checks for any possible SSRF dns/http interactions in xmlrpc.php pingback feature

rpckiller This script checks for any possible SSRF dns/http interactions in xmlrpc.php pingback feature and with that you can further try to escalate

Ashish Kunwar 33 Sep 23, 2022
log4j2 dos exploit,CVE-2021-45105 exploit,Denial of Service poc

说明 about author: 我超怕的 blog: https://www.cnblogs.com/iAmSoScArEd/ github: https://github.com/iAmSOScArEd/ date: 2021-12-20 log4j2 dos exploit log4j2 do

3 Aug 13, 2022
A Docker based LDAP RCE exploit demo for CVE-2021-44228 Log4Shell

log4j-poc An LDAP RCE exploit for CVE-2021-44228 Log4Shell Description This demo Tomcat 8 server has a vulnerable app deployed on it and is also vulne

60 Dec 10, 2022
Log4jake works by spidering a web application for GET/POST requests

Log4jake Log4jake works by spidering a web application for GET/POST requests. It will then automatically execute the GET/POST requests, filling any di

16 May 09, 2022
A Fast Broken Link Hijacker Tool written in Python

Broken Link Hijacker BrokenLinkHijacker(BLH) is a Fast Broken Link Hijacker Tool written in Python.

Mayank Pandey 70 Nov 30, 2022
An All-In-One Pure Python PoC for CVE-2021-44228

Python Log4RCE An all-in-one pure Python3 PoC for CVE-2021-44228. Configure Replace the global variables at the top of the script to your configuratio

Alexandre Lavoie 178 Nov 09, 2022
This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.

This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.

신재욱 17 Sep 25, 2022
web指纹识别工具

前言 一直苦于没有用的顺手的web指纹识别工具,学习前辈s7ckTeam的Glass和broken5的WebAliveScan优秀开源程序开发的轻量型web指纹工具。

EASY 966 Dec 26, 2022
A secure way of storing your passwords.

StrongBox 🔐 A secure way of storing your passwords. 🔑 Why to use StrongBox? StrongBox makes it possible to have a random generated strong password i

Dylan Tintenfich 5 Dec 25, 2021
A collection of write-ups and solutions for Cyber FastTrack Spring 2021.

IMPORTANT: Please contact us before you use any styling or content shown here! Cyber FastTrack Spring 2021 / National Cyber Scholarship Competition -

Alice 48 Aug 28, 2022
Python script to tamper with pages to test for Log4J Shell vulnerability.

log4jShell Scanner This shell script scans a vulnerable web application that is using a version of apache-log4j 2.15.0. This application is a static

GoVanguard 8 Oct 20, 2022
This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks.

This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks. The goal is to successfully use this notebook project below with Security Onion for beacon detection

4 Jun 08, 2022
log4j burp scanner

log4jscanner log4j burp插件 特点如下: 0x01 基于Cookie字段、XFF头字段、UA头字段发送payload 0x02 基于域名的唯一性,将host带入dnslog中 插件主要识别五种形式: 1.get请求,a=1&b=2&c=3 2.post请求,a=1&b=2&c=

1 Jun 30, 2022
Simple python script for generating custom high-secure passwords for securing your social-apps ❤️

Opensource Project Simple Python Password Generator This repository is just for peoples who want to generate strong-passwords for there social-account

K A R T H I K 15 Dec 01, 2022