Various capabilities for static malware analysis.

Overview

Malchive

The malchive serves as a compendium for a variety of capabilities mainly pertaining to malware analysis, such as scripts supporting day to day binary analysis and decoder modules for various components of malicious code.

The goals behind the 'malchive' are to:

  • Allow teams to centralize efforts made in this realm and enforce communication and continuity
  • Have a shared corpus of tools for people to build on
  • Enforce clean coding practices
  • Allow others to interface with project members to develop their own capabilities
  • Promote a positive feedback loop between Threat Intel and Reverse Engineering staff
  • Make static file analysis more accessible
  • Serve as a vehicle to communicate the unique opportunity space identified via deep dive analysis

Documentation

At its core, malchive is a bunch of standalone scripts organized in a manner that the authors hope promotes the project's goals.

To view the documentation associated with this project, checkout the wiki page!

Scripts within the malchive are split up into the following core categories:

  • Utilities - These scripts may be run standalone to assist with static binary analysis or as modules supporting a broader program. Utilities always have a standalone component.
  • Helpers - These modules primarily serve to assist components in one or more of the other categories. They generally do not have a stand-alone component and instead serve the intents of those that do.
  • Binary Decoders - The purpose of scripts in this category is to retrieve, decrypt, and return embedded data (typically inside malware).
  • Active Discovery - Standalone scripts designed to emulate a small portion of a malware family's protocol for the purposes of discovering active controllers.

Installation

The malchive is a packaged distribution that is easily installed and will automatically create console stand-alone scripts.

Steps

You will need to install some dependencies for some of the required Python modules to function correctly.

  • First do a source install of YARA and make sure you compile using --dotnet
  • Next source install the YARA Python package.
  • Ensure you have sqlite3-dev installed
    • Debian: libsqlite3-dev
    • Red Hat: sqlite-devel / pip install pysqlite3

You can then clone the malchive repo and install...

  • pip install . when in the parent directory.
  • To remove, just pip uninstall malchive

Scripts

Console scripts stemming from utilities are appended with the prefix malutil, decoders are appended with maldec, and active discovery scripts are appended with maldisc. This allows for easily identifiable malchive scripts via tab autocompletion.

; running superstrings from cmd line
malutil-superstrings 1.exe -ss
0x9535 (stack) lstrlenA
0x9592 (stack) GetFileSize
0x95dd (stack) WriteFile
0x963e (stack) CreateFileA
0x96b0 (stack) SetFilePointer
0x9707 (stack) GetSystemDirectoryA

; running a decoder from cmd line
maldec-pivy test.exe_
{
    "MD5": "2973ee05b13a575c06d23891ab83e067",
    "Config": {
        "PersistActiveSetupName": "StubPath",
        "DefaultBrowserKey": "SOFTWARE\\Classes\\http\\shell\\open\\command",
        "PersistActiveSetupKeyPart": "Software\\Microsoft\\Active Setup\\Installed Components\\",
        "ServerId": "TEST - WIN_XP",
        "Callbacks": [
            {
                "callback": "192.168.1.104",
                "protocol": "Direct",
                "port": 3333
            },
            {
                "callback": "192.168.1.111",
                "protocol": "Direct",
                "port": 4444
            }
        ],
        "ProxyCfgPresent": false,
        "Password": "test$321$",
        "Mutex": ")#V0qA.I4",
        "CopyAsADS": true,
        "Melt": true,
        "InjectPersist": true,
        "Inject": true
    }
}

; cmd line use with other common utilities
echo -ne 'eJw9kLFuwzAMRIEC7ZylrVGgRSFZiUbBZmwqsMUP0VfcnuQn+rMde7KLTBIPj0ce34tHyMUJjrnw
p3apz1kicjoJrDRlQihwOXmpL4RmSR5qhEU9MqvgWo8XqGMLJd+sKNQPK0dIGjK+e5WANIT6NeOs
k2mI5NmYAmcrkbn4oLPK5gZX+hVlRoKloMV20uQknv2EPunHKQtcig1cpHY4Jodie5pRViV+rp1t
629J6Dyu4hwLR97LINqY5rYILm1hhlvinoyJZavOKTrwBHTwpZ9yPSzidUiPt8PUTkZ0FBfayWLp
a71e8U8YDrbtu0aWDj+/eBOu+jRkYabX+3hPu9LZ5fb41T+7fmRf' | base64 -d | zlib-flate -uncompress | malutil-xor - [KEY]

Interfacing

Utilities, decoders, and discovery scripts in this collection are designed to support single ad-hoc analysis as well as inclusion into other frameworks. After installation, the malchive should be part of your Python path. At this point accessing any of the scripts is straight forward.

Here are a few examples:

; accessing decoder modules
import sys
from malchive.decoders import testdecoder

p = testdecoder.GetConfig(open(sys.argv[1], 'rb').read())
print('password', p.rc4_key)
for c in p.callbacks:
    print('c2 address', c)

; accessing utilities
from malchive.utilities import xor
ret = xor.GenericXor(buff=b'testing', key=[0x51], count=0xff)
print(ret.run_crypt())

; accessing helpers
from malchive.helpers import winfunc
key = winfunc.CryptDeriveKey(b'testdatatestdata')

To understand more about a given module, see the associated wiki entry.

Contributing

Contributing to the malchive is easy, just ensure the following requirements are met:

  • When writing utilities, decoders, or discovery scripts, consider using the available templates or review existing code if you're not sure how to get started.
  • Make sure modification or contributions pass pre-commit tests.
  • Ensure the contribution is placed in one of the component folders.
  • Updated the setup file if needed with an entry.
  • Python3 is a must.

Legal

©2021 The MITRE Corporation. ALL RIGHTS RESERVED.

Approved for Public Release; Distribution Unlimited. Public Release Case Number 21-0153

Owner
MITRE Cybersecurity
MITRE Cybersecurity
A fast and easy implementation of Transformer with PyTorch.

FasySeq FasySeq is a shorthand as a Fast and easy sequential modeling toolkit. It aims to provide a seq2seq model to researchers and developers, which

宁羽 7 Jul 18, 2022
A python framework to transform natural language questions to queries in a database query language.

__ _ _ _ ___ _ __ _ _ / _` | | | |/ _ \ '_ \| | | | | (_| | |_| | __/ |_) | |_| | \__, |\__,_|\___| .__/ \__, | |_| |_| |___/

Machinalis 1.2k Dec 18, 2022
pyMorfologik MorfologikpyMorfologik - Python binding for Morfologik.

Python binding for Morfologik Morfologik is Polish morphological analyzer. For more information see http://github.com/morfologik/morfologik-stemming/

Damian Mirecki 18 Dec 29, 2021
Just a basic Telegram AI chat bot written in Python using Pyrogram.

Nikko ChatBot Just a basic Telegram AI chat bot written in Python using Pyrogram. Requirements Python 3.7 or higher. A bot token. Installation $ https

ʀᴇxɪɴᴀᴢᴏʀ 2 Oct 21, 2022
NLTK Source

Natural Language Toolkit (NLTK) NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets, and tutorials supporting

Natural Language Toolkit 11.4k Jan 04, 2023
Training code for Korean multi-class sentiment analysis

KoSentimentAnalysis Bert implementation for the Korean multi-class sentiment analysis 왜 한국어 감정 다중분류 모델은 거의 없는 것일까?에서 시작된 프로젝트 Environment: Pytorch, Da

Donghoon Shin 3 Dec 02, 2022
Jarvis is a simple Chatbot with a GUI capable of chatting and retrieving information and daily news from the internet for it's user.

J.A.R.V.I.S Kindly consider starring this repository if you like the program :-) What/Who is J.A.R.V.I.S? J.A.R.V.I.S is an chatbot written that is bu

Epicalable 50 Dec 31, 2022
A python package for deep multilingual punctuation prediction.

This python library predicts the punctuation of English, Italian, French and German texts. We developed it to restore the punctuation of transcribed spoken language.

Oliver Guhr 27 Dec 22, 2022
Learn meanings behind words is a key element in NLP. This project concentrates on the disambiguation of preposition senses. Therefore, we train a bert-transformer model and surpass the state-of-the-art.

New State-of-the-Art in Preposition Sense Disambiguation Supervisor: Prof. Dr. Alexander Mehler Alexander Henlein Institutions: Goethe University TTLa

Dirk Neuhäuser 4 Apr 06, 2022
CoNLL-English NER Task (NER in English)

CoNLL-English NER Task en | ch Motivation Course Project review the pytorch framework and sequence-labeling task practice using the transformers of Hu

Kevin 2 Jan 14, 2022
Tools to download and cleanup Common Crawl data

cc_net Tools to download and clean Common Crawl as introduced in our paper CCNet. If you found these resources useful, please consider citing: @inproc

Meta Research 483 Jan 02, 2023
translate using your voice

speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...

1 Oct 18, 2021
Sequence modeling benchmarks and temporal convolutional networks

Sequence Modeling Benchmarks and Temporal Convolutional Networks (TCN) This repository contains the experiments done in the work An Empirical Evaluati

CMU Locus Lab 3.5k Jan 03, 2023
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

MMF is a modular framework for vision and language multimodal research from Facebook AI Research. MMF contains reference implementations of state-of-t

Facebook Research 5.1k Dec 26, 2022
PG-19 Language Modelling Benchmark

PG-19 Language Modelling Benchmark This repository contains the PG-19 language modeling benchmark. It includes a set of books extracted from the Proje

DeepMind 161 Oct 30, 2022
Machine learning models from Singapore's NLP research community

SG-NLP Machine learning models from Singapore's natural language processing (NLP) research community. sgnlp is a Python package that allows you to eas

AI Singapore | AI Makerspace 21 Dec 17, 2022
100+ Chinese Word Vectors 上百种预训练中文词向量

Chinese Word Vectors 中文词向量 中文 This project provides 100+ Chinese Word Vectors (embeddings) trained with different representations (dense and sparse),

embedding 10.4k Jan 09, 2023
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

STEMM: Self-learning with Speech-Text Manifold Mixup for Speech Translation This is a PyTorch implementation for the ACL 2022 main conference paper ST

ICTNLP 29 Oct 16, 2022
TFIDF-based QA system for AIO2 competition

AIO2 TF-IDF Baseline This is a very simple question answering system, which is developed as a lightweight baseline for AIO2 competition. In the traini

Masatoshi Suzuki 4 Feb 19, 2022
Speech Recognition for Uyghur using Speech transformer

Speech Recognition for Uyghur using Speech transformer Training: this model using CTC loss and Cross Entropy loss for training. Download pretrained mo

Uyghur 11 Nov 17, 2022