A plugin to introduce a generic API for Decompiler support in GEF

Overview

decomp2gef

A plugin to introduce a generic API for Decompiler support in GEF. Like GEF, the plugin is batteries-included and requires no external dependencies other than Python.

decomp2gef Demo viewable here.

Quick Start

First, install the decomp2gef plugin into gef:

cp decomp2gef.py ~/.decomp2gef.py && echo "source ~/.decomp2gef.py" >> ~/.gdbinit

Alternatively, you can load it for one-time-use inside gdb with:

source /path/to/decomp2gef.py

Now import the relevant script for your decompiler:

IDA

  • Open IDA on your binary and press Alt-F7.
  • A "Run Script" popup will appear; load the decomp2gef_ida.py script from this repo.

Now use the decompiler connect command in GDB. Note: you must already be in an active debugging session.

Usage

In gdb, run:

decompiler connect ida

If all is well, you should see:

[+] Connected to decompiler!

Now just use GEF like normal and enjoy decompilation and decompiler symbol mapping! When you change a symbol in IDA, like a function name, it will be automatically reflected in gdb after just 2 steps!

Features

  • Auto-updating decompilation context view
  • Auto-syncing function names
  • Breakable/Inspectable symbols
  • Auto-syncing stack variable names
  • Auto-syncing structs

Abstract

The reverse engineering process often involves a decompiler, which makes decompiler support in a debugger fundamental, since carrying knowledge back and forth between the two tools is hard. Decompilers have a lot in common. During the reversing process, the analyst produces reverse engineering artifacts (REAs). These REAs are common across all decompilers:

  • stack variables
  • global variables
  • structs
  • enums
  • function headers (name and prototype)
  • comments

Knowledge of REAs can be used to do lots of things, like syncing REAs across decompilers or creating a common interface for a debugger to display decompilation information. GEF is currently one of the best gdb upgrades, making it a perfect place to first implement this idea. In the future, it should be easily transferable to any debugger supporting Python 3.

Adding your decompiler

To add your decompiler, simply make a Python XMLRPC server that implements the 4 server functions found in the decomp2gef Decompiler class. Follow the code for how to return correct types.
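As a rough illustration of the shape such a server could take, here is a minimal sketch built on Python's standard xmlrpc.server module. The class name and the four method names (ping, decompile, function_data, function_headers) are placeholders invented for this example, as are the return shapes; the real names and return types are the ones defined in the decomp2gef Decompiler class. Port 3662 is the port mentioned elsewhere on this page.

```python
from xmlrpc.server import SimpleXMLRPCServer


class ExampleDecompilerServer:
    """Hypothetical backend. The four method names are placeholders,
    not the actual decomp2gef server API."""

    def ping(self):
        # Lets a client check that the server is reachable.
        return True

    def decompile(self, addr):
        # A real backend would ask the decompiler for the pseudocode
        # of the function containing addr. Here we return fixed data.
        return {
            "func_name": "main",
            "line": 1,
            "code": ["int main() {", "  return 0;", "}"],
        }

    def function_data(self, addr):
        # XML-RPC struct keys must be strings, so offsets are sent as text.
        return {"stack_vars": {"0x10": {"name": "v5", "type": "int"}}}

    def function_headers(self):
        return {"0x1139": {"name": "main", "size": 30}}


def make_server(host="127.0.0.1", port=3662):
    """Build (but do not start) an XML-RPC server exposing the backend."""
    server = SimpleXMLRPCServer((host, port), allow_none=True, logRequests=False)
    server.register_instance(ExampleDecompilerServer())
    return server


# To run it (this call blocks until interrupted):
#     make_server().serve_forever()
```

Note that XML-RPC integers are limited to 32 bits in the standard marshaller, so a real server would likely need to pass 64-bit addresses as strings or offsets.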

Comments
  • Missing or invalid attribute 'comment' on Windows IDA 7.6

    IDA version 7.6 Python 3.10

    The decomp2gef IDA script seems to fail with Z:\...\IDA\IDA Pro 7.6\plugins\decomp2gef_ida.py: Missing or invalid attribute 'comment', and I am unable to see the decomp2gef plugin being loaded successfully. Any ideas?

      bytes   pages size description
    --------- ----- ---- --------------------------------------------
       532480    65 8192 allocating memory for b-tree...
       507904    62 8192 allocating memory for virtual array...
       262144    32 8192 allocating memory for name pointers...
    -----------------------------------------------------------------
      1302528            total memory allocated
    
    Loading processor module Z:\...\IDA\IDA Pro 7.6\procs\pc.dll for metapc...Initializing processor module metapc...OK
    Loading type libraries...
    Autoanalysis subsystem has been initialized.
    Z:\...\IDA\IDA Pro 7.6\plugins\decomp2gef_ida.py: Missing or invalid attribute 'comment'
    Database for file 'redacted' has been loaded.
    Hex-Rays Decompiler plugin has been loaded (v7.6.0.210427)
      License: 57-631C-7A2B-72 IDA PRO 7.6 SP1 (99 users)
      The hotkeys are F5: decompile, Ctrl-F5: decompile all.
    
      Please check the Edit/Plugins menu for more informaton.
    Z:\...\IDA\IDA Pro 7.6\plugins\decomp2gef_ida.py: Missing or invalid attribute 'comment'
    805EF50: restored microcode from idb
    805EF50: restored pseudocode from idb
    -----------------------------------------------------------------------------------------
    Python 3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)] 
    IDAPython v7.4.0 final (serial 0) (c) The IDAPython Team <[email protected]>
    -----------------------------------------------------------------------------------------
    
    bug 
    opened by caprinux 10
  • ELF fails to build with objcopy on some binaries

    Found out about this decomp2gef script and was eager to try it out :)

    When trying it out, I encountered this error: [screenshot not preserved]

    I'm currently using GDB 11.1, IDA 7.6 together with the latest gef.py script freshly pulled from the gef github. Any clue how I could fix this?

    bug 
    opened by caprinux 8
  • Improve code logic and fix errors

    This PR aims to address a few issues in decomp2gef

    First issue addressed

    As of now, if you try to connect the decompiler while debugging a binary on a remote server via gdbserver, chances are you will encounter the error min() arg is an empty sequence, which arises from the following lines:

    base_address = min([x.page_start for x in vmmap if x.path == get_filepath()])
    ...
    text_base = min([x.page_start for x in vmmap if x.path == get_filepath()])
    

    This arises due to an inconsistency in the exact path of the binary between the remote server and the local machine.

    For example, if I debug a remote binary with file path /bin/program while the local copy is at /tmp/program, x.path will return /bin/program and get_filepath() will return /tmp/program. The two do not match, so the list is empty.

    Hence a slight modification to compare the file name rather than the absolute path will solve this issue.

    base_address = min([x.page_start for x in vmmap if x.path.split('/')[-1] == get_filename()])
    ...
    text_base = min([x.page_start for x in vmmap if x.path.split('/')[-1] == get_filename()])
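
The idea can be tried outside of GDB with a small stand-alone sketch. The Section tuple below is a hypothetical stand-in for GEF's vmmap entries (only page_start and path are modelled), and find_base_address is a helper name invented for this example. It compares basenames and raises a readable error instead of letting min() fail on an empty list.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for a GEF vmmap entry; only two fields are modelled.
Section = namedtuple("Section", ["page_start", "path"])


def find_base_address(vmmap, local_path):
    """Return the lowest mapping whose file name matches the local binary."""
    target = os.path.basename(local_path)
    starts = [s.page_start for s in vmmap if os.path.basename(s.path) == target]
    if not starts:
        # Avoids the opaque "min() arg is an empty sequence" error.
        raise ValueError(f"no mapping named {target!r} found")
    return min(starts)


vmmap = [
    Section(0x555555554000, "/bin/program"),
    Section(0x555555558000, "/bin/program"),
    Section(0x7FFFF7DD0000, "/usr/lib/libc.so.6"),
]
base = find_base_address(vmmap, "/tmp/program")
print(hex(base))  # 0x555555554000
```

One trade-off of matching by file name only: two different mapped files that happen to share a name would be treated as the same binary.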
    

    Second issue addressed

    def update_function_data(self, addr):
        ...
        for idx, arg in args.items():
            idx = int(idx, 0)
            expr = f"""(({arg['type']}) {current_arch.function_parameters[idx]})"""
        ...
    

    Within update_function_data(), we use current_arch.function_parameters to get the registers in which function arguments are stored, and use idx to match each argument to the register or memory location that holds it.

    In x86_64, current_arch.function_parameters = ['$rdi', '$rsi', '$rdx', '$rcx', '$r8', '$r9'], and this works for functions with up to 6 arguments, as current_arch.function_parameters[0] will match argument 1 to $rdi and so on.

    The flaw comes when we fail to consider functions with 7 or more arguments, where the remaining arguments are found on the stack.

    However, we can set this aside for now; we don't usually encounter 7 or more arguments, right?

    The more urgent flaw appears when we bring in the x86 architecture.

    In x86, current_arch.function_parameters = ['$esp']. This means that beyond the first argument, decomp2gef will break with an IndexError as it tries to access current_arch.function_parameters[1] for the 2nd argument and so on.

    I'm not sure if there's a nice way to do this, but I essentially redefined current_arch.function_parameters for x86 architectures.

    current_arch.function_parameters = [f'$esp+{x}' for x in range(0, 28, 4)]
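
One possible way to cover both the 7th-argument case and x86 is to fall back to stack slots once the register list is exhausted. The sketch below is an illustration only: argument_location is a name invented here, and the offsets assume the process is stopped at function entry, before the prologue runs, so the return address occupies the first stack word. That differs from the list above, which starts at $esp+0.

```python
X86_64_REGS = ["$rdi", "$rsi", "$rdx", "$rcx", "$r8", "$r9"]


def argument_location(idx, reg_params, stack_reg, word_size):
    """Return ("reg", name) or ("mem", address expression) for argument idx.

    Assumes execution is stopped at function entry, so the return address
    sits at [stack_reg] and stack arguments start one word above it.
    """
    if idx < len(reg_params):
        return ("reg", reg_params[idx])
    stack_idx = idx - len(reg_params)
    offset = word_size * (stack_idx + 1)
    return ("mem", f"{stack_reg}+{offset}")


# x86_64 (System V): six register arguments, then the stack.
print(argument_location(0, X86_64_REGS, "$rsp", 8))  # ('reg', '$rdi')
print(argument_location(6, X86_64_REGS, "$rsp", 8))  # ('mem', '$rsp+8')

# x86 (cdecl): every argument is on the stack.
print(argument_location(1, [], "$esp", 4))  # ('mem', '$esp+8')
```

A "mem" result is an address, so the caller would still have to dereference it with the argument's type rather than cast it directly.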
    

    This allows decomp2gef to work, but I haven't considered the implications yet, as it may get pretty complicated(?)

    I welcome any ideas!!

    opened by caprinux 7
  • invalid string offset for section `.strtab'

    When connecting GEF to the decompiler, gdb fails to add-symbol-file and throws an error.

    BFD: /tmp/tmp3pmay68f.c.debug: invalid string offset 16777215 >= 282 for section '.strtab'

    Although this does not break decomp2gef, it causes the debugger to be without a symbol file and hence renders some features unusable.

    Sample binary with this behavior: sample_program.zip

    bug 
    opened by caprinux 6
  • Add proper support for attaching

    Currently, support for the attach command is shaky with PIE binaries and requires the strict process of launching a new gdb instance, attaching to the target process ID, and then connecting to decomp2dbg. Attempting to attach again after this causes the attached-to binary to have a new base address, and I assume this prevents decomp2dbg from functioning properly, as I lose symbols after that. On my local system, disconnecting and reconnecting doesn't fix this (whether done before or after the binary is attached to for the second time).

    enhancement 
    opened by frqmod 2
  • add instruction for wsl2

    I suggest adding instructions for those who want to use this tool in WSL.

    1. Run ./install.sh --ida /mnt/c/xxx/IDA/plugins to install.
    2. Listen on 0.0.0.0:3662 in IDA.
    3. Add an Inbound Rule for port 3662 in Windows Firewall, for the private or domain network.
    4. Run decompiler connect ida 192.168.xxx.xxx(LAN IP) 3662 in gdb to connect.
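
Before the last step, it can help to confirm from inside WSL that the port is actually reachable through the firewall. The following is a small sketch using only the Python standard library; port_is_open is a helper name made up for this example, and the host to pass is whatever LAN IP is used in the connect step.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example (replace the address with the Windows host's LAN IP):
#     port_is_open("192.168.xxx.xxx", 3662)
```

A False result before connecting usually points at the firewall rule or at IDA listening on 127.0.0.1 instead of 0.0.0.0.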
    opened by RoderickChan 1
  • add checks when forcing text size

    Previously, we forced text_size to 0xFFFFFF with a nasty hack that only works on 64-bit binaries.

    This means that 32-bit binaries throw a .strtab offset error, or something along those lines, most of the time, which is rather ugly.

    Hence we implement a bitness check on the binary and apply the hack only if the binary is 64-bit. We should definitely implement a nicer method that caters to 32-bit binaries when possible, but until then this will suffice to preserve sanity.
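
For reference, a bitness check does not need a full ELF parser: the ELF identification bytes store the class at offset 4 (1 for 32-bit, 2 for 64-bit). The sketch below shows that check in isolation; the function names are invented for this example and this is not necessarily how the PR implements it.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4                    # index of the class byte in e_ident
ELFCLASS32, ELFCLASS64 = 1, 2   # values defined by the ELF specification


def elf_is_64bit(header: bytes) -> bool:
    """Decide 32/64-bit from the first five bytes of an ELF file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class = header[EI_CLASS]
    if elf_class == ELFCLASS64:
        return True
    if elf_class == ELFCLASS32:
        return False
    raise ValueError(f"unknown ELF class {elf_class}")


def file_is_64bit(path):
    """Convenience wrapper that reads the header from disk."""
    with open(path, "rb") as f:
        return elf_is_64bit(f.read(5))
```

With such a check, the 0xFFFFFF text size hack could be guarded by a call like file_is_64bit(path_to_binary).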

    addresses #14 !!

    opened by caprinux 1
  • Fix symbol size offsets/Designate appropriate symbol size to symbols

    Addresses #15. It took a while to debug, but on comparing queued_sym_sizes to sym_info_list, they seemed to be correct and match accordingly.

    I tested this PR against very basic 32-bit and 64-bit binaries, which both seem to give me the appropriate symbol sizes upon calling readelf.

    Do have a look! Unfortunately, this does not fix #14 :(

    opened by caprinux 1
  • Requires sortedcontainers

    decomp2gef actually does have a single dependency which I did not realize was a dependency: sortedcontainers. It's needed to create a fast and memory-friendly mapping for non-native symbols in gef. We should decide whether we want to make decomp2gef dependent on some Python packages, or try to replace the functionality of SortedDict.
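
To make the second option concrete, here is one way an address-to-symbol mapping could be built on the standard library's bisect module instead of SortedDict. AddressMap and its methods are names invented for this sketch; it is not the project's actual data structure.

```python
import bisect


class AddressMap:
    """Maps address ranges to symbol names using a sorted list of starts."""

    def __init__(self):
        self._starts = []   # sorted symbol start addresses
        self._syms = {}     # start address -> (name, size)

    def add(self, addr, name, size):
        if addr not in self._syms:
            bisect.insort(self._starts, addr)
        self._syms[addr] = (name, size)

    def lookup(self, addr):
        """Return the name of the symbol containing addr, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        start = self._starts[i]
        name, size = self._syms[start]
        return name if addr < start + size else None


symbols = AddressMap()
symbols.add(0x1000, "main", 0x20)
symbols.add(0x2000, "helper", 0x10)
print(symbols.lookup(0x1010))  # main
print(symbols.lookup(0x1020))  # None
```

Lookups stay logarithmic, but each insertion shifts list elements, so inserting many symbols is slower than with SortedDict. Whether that matters depends on how many symbols a binary has.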

    discussion 
    opened by mahaloz 1
  • Feat Request: Programmable Ports & IPs

    As brought up by @caprinux (in #3), we don't support specifying ports or IPs for GEF to connect over. Currently, the port is hardcoded to 3662.

    To allow for this, we will need a fundamental change in architecture for the server-side, since we need a way to specify port and IP.

    enhancement 
    opened by mahaloz 1
  • Register Variable Support

    • we can now support every variable shown in a decompiler (including the ones assigned to a register)
    • functions args are being deprecated in favor of setting them through either register vars or stack vars
    • refactored some janky type setting code

    Closes #38

    opened by mahaloz 0
  • Stack Vars from IDA assigned to incorrect locations

    Here is a simple example I tested on decomp2dbg v3.1.3 and IDA 7.5:

    #include <stdio.h>
    
    int main()
    {
        int a = 1;
        int b;
        scanf("%d",&b);
        int c = a + b;
        printf("%d\n",c);
        return 0;
    }
    

    The decompiled output from IDA is:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      int v4; // [rsp+Ch] [rbp-14h] BYREF
      int v5; // [rsp+10h] [rbp-10h]
      int v6; // [rsp+14h] [rbp-Ch]
      unsigned __int64 v7; // [rsp+18h] [rbp-8h]
    
      v7 = __readfsqword(0x28u);
      v5 = 1;
      __isoc99_scanf(&unk_2004, &v4, envp);
      v6 = v4 + v5;
      printf("%d\n", (unsigned int)(v4 + v5));
      return 0;
    }
    

    When execution reached __isoc99_scanf(&unk_2004, &v4, envp);, I tried to show the value of v5, but it differs from the value at $rbp - 0x10.

    [screenshot not preserved]

    Is this a bug or did I do something wrong?

    bug 
    opened by LioTree 2
  • binary ninja plugin manager?

    The binary ninja plugin manager supports plugins that exist in subfolders. All it would take would be to tag and cut a release (or using release_helper) and let me know on this issue and I can add it. Then subsequent releases automatically notify us and we update the plugin manager accordingly.

    Note that I haven't looked at the current import hierarchy but because the plugin manager doesn't necessarily install things to the global namespace it might require some tweaks to how imports are done.

    opened by psifertex 1
  • Tab completion broken in Archlinux gdb

    After sourcing decomp2dbg in my gdbinit, the commands are available, but it is no longer possible to use tab completion to look for help or the like. This happens with a naked gdb with source ~/.decomp2dbg.py as the only entry.

    opened by ysf 6
  • Add Support for shared libraries.

    Hello. I tested this on a binary and it works perfectly, great tool! But when I test on a shared library loaded by another binary, the tool doesn't work after connecting to the server (no decompilation, and breakpoints have offset errors). I know the original behavior is to start the server on the binary and use connect while debugging that binary with gdb. But in my situation I can't debug the library without first starting the main binary that links against it. (Maybe add syncing with multiple servers? decomp2dbg can't export IDA's decompiled code to gdb because the shared library is external, but adding another syncing server on the shared library's IDA instance, and syncing correctly when execution jumps into the shared library, might work.)

    opened by 0xMirasio 16
  • Add Support for Struct Imports

    For now, we will only support IDA since we have a clear-cut way to both get every struct and also know when they have been updated. This may also be possible in Binja, but it is out of the question for future Ghidra support... that one will have to wait.

    IDA Changes

    In IDA we need to find all ordinal numbers, each of which represents a custom struct in IDA. After that, we can use idc.print_decls("1", 0) for each number to get a nice C representation of the struct. Now that we have a string that has the C definition of the struct, we need to do things in the core.

    The changes all take place in the server. It's possible this may change the old API.

    Client Changes

    Assuming we now have a series of structs that are represented in C, we actually need to compile them into an object file and then add them with the classic add-symbol-file we use on the backend for other things. The trick though is adding this symbol file before we add the big one with all the global symbols here: https://github.com/mahaloz/decomp2dbg/blob/57983617d9a14f1f2ed7b54ee07bd15f14075c45/decomp2dbg/clients/gdb/symbol_mapper.py#L243

    Since both symbol files will be loaded into the same place, there will be an overlapping main function. We either need to bake structs directly into the first file we create, or we need to make a new way to add native-struct through the symbol mapper.

    enhancement 
    opened by mahaloz 0
Releases(v3.1.3)
  • v3.1.3(Nov 21, 2022)

  • v3.1.2(Nov 21, 2022)

  • v3.1.1(Nov 18, 2022)

  • v3.1.0(Nov 17, 2022)

    What's Changed

    • Add Ghidra Demo & Refactor readme by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/47
    • Reflect config changes in pwndbg by @szymex73 in https://github.com/mahaloz/decomp2dbg/pull/48
    • Typo in README.md by @Ice1187 in https://github.com/mahaloz/decomp2dbg/pull/49

    New Contributors

    • @szymex73 made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/48
    • @Ice1187 made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/49

    Full Changelog: https://github.com/mahaloz/decomp2dbg/compare/v3.0.0...v3.1.0

    d2d-ghidra-plugin.zip(1.41 MB)
  • v3.0.0(Oct 25, 2022)

    • Added Ghidra support
    • Refactored how installing works to full Python-only
    • Added dependence on the BinSync project

    What's Changed

    • Python Installer by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/46

    Full Changelog: https://github.com/mahaloz/decomp2dbg/compare/v2.2.0...v3.0.0

    d2d-ghidra-plugin.zip(1.41 MB)
  • v2.2.0(Oct 25, 2022)

    What's Changed

    • Support native symbol mapping by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/1
    • added REAL native support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/2
    • fail gracefully on bad argv & fix bad elfs reads by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/6
    • Add angrmanagement support for decomp2gef by @Cl4sm in https://github.com/mahaloz/decomp2dbg/pull/8
    • DRAFT: Local Var Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/7
    • Api refactor by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/9
    • Major Refactor: Global Vars, Programmable Ports, Packaging by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/10
    • make sure IDA Plugin has all consts defined by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/12
    • Support remote debugging by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/17
    • Fix rebasing bugs in angr-decompiler plugin by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/18
    • Update GEF API use to latest version by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/22
    • Fix symbol size offsets/Designate appropriate symbol size to symbols by @caprinux in https://github.com/mahaloz/decomp2dbg/pull/20
    • add checks when forcing text size by @caprinux in https://github.com/mahaloz/decomp2dbg/pull/21
    • minor fixes + replace all usage of GEF Elf object with pyelftools by @caprinux in https://github.com/mahaloz/decomp2dbg/pull/25
    • [WIP] Binja Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/19
    • fixed bad sizing on binaries that generate a larger blank symbol section by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/26
    • Fix symbol duplication by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/27
    • fix another duplication bug by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/28
    • [WIP] Support Vanilla GDB by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/30
    • Fix a typo that caused the plugin not to work for Python <3.8. by @adamdoupe in https://github.com/mahaloz/decomp2dbg/pull/32
    • Fix Manual Install Instructions by @adamdoupe in https://github.com/mahaloz/decomp2dbg/pull/31
    • Fix README typos by @mborgerson in https://github.com/mahaloz/decomp2dbg/pull/33
    • Handle stack frame offset for stack variables on x86 architectures by @zolutal in https://github.com/mahaloz/decomp2dbg/pull/35
    • always refresh baseaddr on connect by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/37
    • Register Variable Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/39
    • Support loading symbols at configurable base addresses by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/42
    • Ghidra Support by @mahaloz in https://github.com/mahaloz/decomp2dbg/pull/45

    New Contributors

    • @mahaloz made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/1
    • @Cl4sm made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/8
    • @caprinux made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/20
    • @adamdoupe made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/32
    • @mborgerson made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/33
    • @zolutal made their first contribution in https://github.com/mahaloz/decomp2dbg/pull/35

    Full Changelog: https://github.com/mahaloz/decomp2dbg/commits/v2.2.0

    d2d-ghidra-plugin.zip(1.41 MB)
Owner
Zion
Native Hawaiian | Phd Student @sefcom | Co-captain @shellphish | President of @asu-hacking-club