A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Overview

A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance

Python version Project version Codacy Grade


UsageDownload

Introduction

ByteCog is a python script that aims to help security researchers and others a like to classify malicious software compared to other samples, depending on what the unknown file(s) is/are being tested against. This script can be extended to use a machine learning model to classify malware if you wanted to do so. ByteCog uses multiple methods of analyzing and classifying samples given to it, such as using Shannon Entropy to give a visual aspect for the researchers to look at while analyzing the code and finding possible readable code/text in a sample. ByteCog also uses Hausdorff Distance to calculate a 'raw similarity' value based on the difference in the entropy graphs of both samples, and finally ByteCog uses Jaro-Winkler Distance to calculate the 'true similarity' since the Hausdorff Distance will in most cases return a very high value if the sample is mostly the same entropy wise, so the Jaro-Winkler Distance is used to 'adjust' the simliarity value for this case of a sample.

Requirements

  • A python installation above 3.5+, which you can download from the official python website here.

Installation

Clone this repository to your local machine by following these instructions layed out here

Then proceed to download the dependencies file by running the following line in your console window

pip install -r requirements.txt

Usage

======================================================
|      ____          __         ______               |
|     / __ ) __  __ / /_ ___   / ____/____   ____    |
|    / __  |/ / / // __// _ \ / /    / __ \ / __ \   |
|   / /_/ // /_/ // /_ /  __// /___ / /_/ // /_/ /   |
|  /_____/ \__, / \__/ \___/ \____/ \____/ \__, /    |
|         /____/                          /____/     |
|                                                    |
|                    Version: 0.4                    |
|               Author: IlluminatiFish               |
======================================================

usage: bytecog.py [-h] -k KNOWN -u UNKNOWN -i IDENTIFIER -v VISUAL

Determine whether an unknown provided sample is similar to a known sample

optional arguments:
  -h, --help            show this help message and exit
  -k KNOWN, --known KNOWN
                        The file path to the known sample
  -u UNKNOWN, --unknown UNKNOWN
                        The file path to the unknown sample
  -i IDENTIFIER, --identifier IDENTIFIER
                        The antivirus identifier of the known file
  -v VISUAL, --visual VISUAL
                        If you want to show a visual representation of the file entropy

Features & Use Cases

  • Calculates sample similarity
  • Generates chunked entropy graph
  • Able to possibly detect malicious and benign software samples

Screenshots

Chunked Entropy Graph
chunk_entropy_graph

Output of ByteCog
bytecog_output

ByteCog Log File
bytecog log file

License

ByteCog - A way to analyse how malware and/or goodware samples vary from each other using Shannon Entropy, Hausdorff Distance and Jaro-Winkler Distance Copyright (c) 2021 IlluminatiFish

This program is free software; you can redistribute it and/or modify the code base under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but without ANY warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/

Acknowledgements

  • Using a modified version of @venkat-abhi's Shannon Entropy calculator to work with my project script, you can find the original one here.

  • Using the fastest method to get maximum key from a dictionary using this snippet here.

References

Entropy Wiki
Jaro-Winkler Distance Wiki
Hausdorff Distance Wiki
Shannon Calculator
Referenced Article #1
Referenced Paper #1
Referenced Paper #2
Referenced Paper #3

Owner
I am a developer that has a passion for programming, mathematics and cyber security. Currently Developer @South-Hollow
telegram bug that discloses user's hidden phone number (still unpatched) (exploit included)

CVE-2019-15514 Type: Information Disclosure Affected Users, Versions, Devices: All Telegram Users Still not fixed/unpatched. brute.py is available exp

Gray Programmerz 66 Dec 08, 2022
the metasploit script(POC) about CVE-2021-36260

CVE-2021-36260-metasploit the metasploit script(POC) about CVE-2021-36260. A command injection vulnerability in the web server of some Hikvision produ

Taroballz 14 Nov 09, 2022
Um keylogger que se disfarça de um app que tira print da tela.

Keylogger_ Um keylogger que se disfarça de um app que tira print da tela. Este programa captura o print da tela e salva ,normalmente, na pasta Picture

Marcus Vinícius Ribeiro Andrade 1 Dec 03, 2021
A Safer PoC for CVE-2022-22965 (Spring4Shell)

Safer_PoC_CVE-2022-22965 A Safer PoC for CVE-2022-22965 (Spring4Shell) Functionality Creates a file called CVE_2022-22965_exploited.txt in the tomcat

Colin Cowie 46 Nov 12, 2022
This repository uses a mixture of numbers, alphabets, and other symbols found on the computer keyboard

This repository uses a mixture of numbers, alphabets, and other symbols found on the computer keyboard to form a 16-character password which is unpredictable and cannot easily be memorised.

Mohammad Shaad Shaikh 1 Nov 23, 2021
Tool to check if your DNS comply to Polish Ministry of Finance gambling domains restrictions

dns-mf-hazard Tool to check if your DNS comply to Polish Ministry of Finance gambling domains restrictions How to use it? Installation You need python

Marek Wajdzik 2 Jan 01, 2022
This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks.

This project is for finding a solution to use Security Onion Elastic data with Jupyter Notebooks. The goal is to successfully use this notebook project below with Security Onion for beacon detection

4 Jun 08, 2022
Deobfuscate Log4Shell payloads with ease

Ox4Shell Deobfuscate Log4Shell payloads with ease. Description Since the release

Oxeye 137 Jan 02, 2023
zip-brute Zip File Password Cracking with Using Password List

Zip brute is a python script that cracks zip that are password protected using a wordlist dictionary.

AnonyminHack5 13 Nov 03, 2022
A python script to bypass 403-forbidden.

4nought3 A python script to bypass 403-forbidden. It covers methods like Host-Header Injections, Changing HTTP Requests Methods and URL-Injections. Us

11 Aug 27, 2022
POC for detecting the Log4Shell (Log4J RCE) vulnerability.

log4shell-poc-py POC for detecting the Log4Shell (Log4J RCE) vulnerability. Run on a system with python3 python3 log4shell-poc.py pathToTargetFile

BCC Risk Advisory 2 Dec 22, 2021
automatically crawl every URL and find cross site scripting (XSS)

scancss Fastest tool to find XSS. scancss is a fastest tool to detect Cross Site scripting (XSS) automatically and it's also an intelligent payload ge

Md. Nur habib 30 Sep 24, 2022
Selamat Datang DiTools Crack-Old, Crack Old Adalah Sebuah Crack Tanpa Login Dan Crack Menggunakan Akun Facebook Tua/Old.

Selamat Datang DiTools Crack-Old, Crack Old Adalah Sebuah Crack Tanpa Login Dan Crack Menggunakan Akun Facebook Tua/Old. ([Welcome to Crack-Old Tools, Old Crack Is A Crack Without Login And Crack Usi

Risky [ Zero Tow ] 7 Dec 25, 2022
SCodeScanner stands for Source Code scanner where the user can scans the source code for finding the Critical Vulnerabilities.

The SCodeScanner stands for Source Code Scanner, where you can scan your source code files like PHP and get identify the vulnerabilities inside it. The tool can use by Pentester, Developer to quickly

136 Dec 13, 2022
This is a multi-password‌ cracking tool that can help you hack facebook accounts very quickly

Pro_Crack Facebook Fast Cracking Tool This is a multi-password‌ cracking tool that can help you hack facebook accounts very quickly Installation On Te

•JINN• 1 Jan 16, 2022
#whois it? Let's find out!

whois_bot #whois it? Let's find out! Currently in development: a gatekeeper bot for a community (https://t.me/IT_antalya) of 250+ expat IT pros of Ant

Kirill Nikolaev 14 Jun 24, 2022
GitHub Advance Security Compliance Action

advanced-security-compliance This Action was designed to allow users to configure their Risk threshold for security issues reported by GitHub Code Sca

Mathew Payne 121 Dec 14, 2022
RCE 0-day for GhostScript 9.50 - Payload generator

RCE-0-day-for-GhostScript-9.50 PoC for RCE 0-day for GhostScript 9.50 - Payload generator The PoC in python generates payload when exploited for a 0-d

534 Dec 14, 2022
Automatically download all 10,000 CryptoPunk NFTs.

CryptoPunk Stealer The sole purpose of this script is to download the entire CryptoPunk NFT collection. How does it work? Basically, the website where

Dan 7 Oct 22, 2022
A honeypot for the Log4Shell vulnerability (CVE-2021-44228)

Log4Pot A honeypot for the Log4Shell vulnerability (CVE-2021-44228). License: GPLv3.0 Features Listen on various ports for Log4Shell exploitation. Det

Thomas Patzke 79 Dec 27, 2022