Code2flow generates call graphs for dynamic programming language. Code2flow supports Python, Javascript, Ruby, and PHP.

Overview

code2flow logo

Version 2.2.0 Build passing Coverage 100% License MIT

Code2flow generates call graphs for dynamic programming language. Code2flow supports Python, Javascript, Ruby, and PHP.

The basic algorithm is simple:

  1. Translate your source files into ASTs.
  2. Find all function definitions.
  3. Determine where those functions are called.
  4. Connect the dots.

Code2flow is useful for:

  • Untangling spaghetti code.
  • Identifying orphaned functions.
  • Getting new developers up to speed.

Code2flow will provide a pretty good estimate of your project's structure. No algorithm can generate a perfect call graph for a dynamic language - even less so if that language is duck-typed. See the known limitations in the section below.

(Below: Code2flow running on itself (excl javascript, PHP, & Ruby for clarity))

code2flow running against itself

Installation

pip3 install code2flow

If you don't have it already, you will also need to install graphviz. Installation instructions can be found here.

Usage

To generate a DOT file run something like:

code2flow mypythonfile.py

Or, for javascript:

code2flow myjavascriptfile.js

You can also specify multiple files or import directories:

code2flow project/directory/source_a.js project/directory/source_b.js
code2flow project/directory/*.js
code2flow project/directory --language js

There are a ton of command line options, to see them all, run:

code2flow --help

How code2flow works

Code2flow approximates the structure of projects in dynamic languages. It is not possible to generate a perfect callgraph for a dynamic language.

Detailed algorithm:

  1. Generate an AST of the source code
  2. Recursively separate groups and nodes. Groups are files, modules, or classes. More precisely, groups are namespaces where functions live. Nodes are the functions themselves.
  3. For all nodes, identify function calls in those nodes.
  4. For all nodes, identify in-scope variables. Attempt to connect those variables to specific nodes and groups. This is where there is some ambiguity in the algorithm because it is possible to know the types of variables in dynamic languages. So, instead, heuristics must be used.
  5. For all calls in all nodes, attempt to find a match from the in-scope variables. This will be an edge.
  6. If a definitive match from in-scope variables cannot be found, attempt to find a single match from all other groups and nodes.
  7. Trim orphaned nodes and groups.
  8. Output results.

Why is it impossible to generate a perfect call graph?

Consider this toy example in Python

def func_factory(param):
    if param < .5:
        return func_a
    else:
        return func_b

func = func_factory(important_variable)
func()

We have no way of knowing whether func will point to func_a or func_b until runtime. In practice, ambiguity like this is common and is present in most non-trivial applications.

Known limitations

Code2flow is internally powered by ASTs. Most limitations stem from a token not being named what code2flow expects it to be named.

  • All functions without definitions are skipped. This most often happens when a file is not included.
  • Functions with identical names in different namespaces are (loudly) skipped. E.g. If you have two classes with identically named methods, code2flow cannot distinguish between these and skips them.
  • Imported functions from outside of your project directory (including from standard libraries) which share names with your defined functions may not be handled correctly. Instead when you call the imported function, code2flow will link to your local functions. E.g. if you have a function "search()" and call, "import searcher; searcher.search()", code2flow may link (incorrectly) to your defined function.
  • Anonymous or generated functions are skipped. This includes lambdas and factories.
  • If a function is renamed, either explicitly or by being passed around as a parameter, it will be skipped.

How to contribute

  1. Open an issue: Code2flow is not perfect and there is a lot that can be improved. If you find a problem parsing your source that you can identify with a simplified example, please open an issue.
  2. Create a PR: Even better, if you have a fix for the issue you identified that passes unit tests, please open a PR.
  3. Add a language: While dense, each language implementation is between 250-400 lines of code including comments. If you want to implement another language, the existing implementations can be your guide.

License

Code2flow is licensed under the MIT license. Prior to the rewrite in April 2021, code2flow was licensed under LGPL. The last commit under that license was 24b2cb854c6a872ba6e17409fbddb6659bf64d4c. The April 2021 rewrite was substantial so it's probably reasonable to treat code2flow as completely MIT-licensed.

Acknowledgements

  • In mid-2021, Code2flow was rewritten and two new languages were added. This was prompted and financially supported by the Sider Corporation.
  • The code2flow pip name was graciouly transferred to this project from Dheeraj Nair. He was using it for his own (unrelated) code2flow project.
  • Many others have contributed through bug fixes, cleanups, and identifying issues. Thank you!!!

Unrelated projects

The name, "code2flow", has been used for several unrelated projects. Specifically, the domain, code2flow.com, has no association with this project. I've never heard anything from them and it doesn't appear like they use anything from here.

Feedback / Contact

Please do email! [email protected]

Feature Requests

Email me. At any time, I'm spread thin across a lot of projects so I will, unfortunately, turn down most requests. However, I am open to paid development for compelling features.

Owner
Scott Rogowski
Author of Mongita, Code2Flow, and the FFER. Working on Fastmap - looking for cofounders.
Scott Rogowski
Trace any Python program, anywhere!

lptrace lptrace is strace for Python programs. It lets you see in real-time what functions a Python program is running. It's particularly useful to de

Karim Hamidou 687 Nov 20, 2022
NoPdb: Non-interactive Python Debugger

NoPdb: Non-interactive Python Debugger Installation: pip install nopdb Docs: https://nopdb.readthedocs.io/ NoPdb is a programmatic (non-interactive) d

Ondřej Cífka 67 Oct 15, 2022
Silky smooth profiling for Django

Silk Silk is a live profiling and inspection tool for the Django framework. Silk intercepts and stores HTTP requests and database queries before prese

Jazzband 3.7k Jan 01, 2023
pdb++, a drop-in replacement for pdb (the Python debugger)

pdb++, a drop-in replacement for pdb What is it? This module is an extension of the pdb module of the standard library. It is meant to be fully compat

1k Dec 24, 2022
(OLD REPO) Line-by-line profiling for Python - Current repo ->

line_profiler and kernprof line_profiler is a module for doing line-by-line profiling of functions. kernprof is a convenient script for running either

Robert Kern 3.6k Jan 06, 2023
Dahua Console, access internal debug console and/or other researched functions in Dahua devices.

Dahua Console, access internal debug console and/or other researched functions in Dahua devices.

bashis 156 Dec 28, 2022
🔥 Pyflame: A Ptracing Profiler For Python. This project is deprecated and not maintained.

Pyflame: A Ptracing Profiler For Python (This project is deprecated and not maintained.) Pyflame is a high performance profiling tool that generates f

Uber Archive 3k Jan 07, 2023
Hunter is a flexible code tracing toolkit.

Overview docs tests package Hunter is a flexible code tracing toolkit, not for measuring coverage, but for debugging, logging, inspection and other ne

Ionel Cristian Mărieș 705 Dec 08, 2022
A simple rubber duck debugger

Rubber Duck Debugger I found myself many times asking a question on StackOverflow or to one of my colleagues just for finding the solution simply by d

1 Nov 10, 2021
Little helper to run Steam apps under Proton with a GDB debugger

protongdb A small little helper for running games with Proton and debugging with GDB Requirements At least Python 3.5 protontricks pip package and its

Joshie 21 Nov 27, 2022
Sweeter debugging and benchmarking Python programs.

Do you ever use print() or log() to debug your code? If so, ycecream, or y for short, will make printing debug information a lot sweeter. And on top o

42 Dec 12, 2022
Parsing ELF and DWARF in Python

pyelftools pyelftools is a pure-Python library for parsing and analyzing ELF files and DWARF debugging information. See the User's guide for more deta

Eli Bendersky 1.6k Jan 04, 2023
EDB 以太坊单合约交易调试工具

EDB 以太坊单合约交易调试工具 Idea 在刷题的时候遇到一类JOP(Jump-Oriented-Programming)的题目,fuzz或者调试这类题目缺少简单易用的工具,由此开发了一个简单的调试工具EDB(The Ethereum Debugger),利用debug_traceTransact

16 May 21, 2022
VizTracer is a low-overhead logging/debugging/profiling tool that can trace and visualize your python code execution.

VizTracer is a low-overhead logging/debugging/profiling tool that can trace and visualize your python code execution.

2.8k Jan 08, 2023
Trace all method entries and exits, the exit also prints the return value, if it is of basic type

Trace all method entries and exits, the exit also prints the return value, if it is of basic type. The apk must have set the android:debuggable="true" flag.

Kurt Nistelberger 7 Aug 10, 2022
Arghonaut is an interactive interpreter, visualizer, and debugger for Argh! and Aargh!

Arghonaut Arghonaut is an interactive interpreter, visualizer, and debugger for Argh! and Aargh!, which are Befunge-like esoteric programming language

Aaron Friesen 2 Dec 10, 2021
A powerful set of Python debugging tools, based on PySnooper

snoop snoop is a powerful set of Python debugging tools. It's primarily meant to be a more featureful and refined version of PySnooper. It also includ

Alex Hall 874 Jan 08, 2023
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.

LM-Debugger is an open-source interactive tool for inspection and intervention in transformer-based language models. This repository includes the code

Mor Geva 110 Dec 28, 2022
A gdb-like Python3 Debugger in the Trepan family

Abstract Features More Exact location information Debugging Python bytecode (no source available) Source-code Syntax Colorization Command Completion T

R. Bernstein 126 Nov 24, 2022
printstack is a Python package that adds stack trace links to the builtin print function, so that editors such as PyCharm can link you to the source of the print call.

printstack is a Python package that adds stack trace links to the builtin print function, so that editors such as PyCharm can link to the source of the print call.

101 Aug 26, 2022