A GitHub action that suggests type annotations for Python using machine learning.

Last update: Sep 18, 2022

Overview

Typilus: Suggest Python Type Annotations

This action makes suggestions within each pull request as suggested edits. You can then directly apply these suggestions to your code or ignore them.

What are Python type annotations? Introduced in Python 3.5, type hints (more traditionally called type annotations) allow users to annotate their code with the expected types. These annotations are optionally checked by external tools, such as mypy and pyright, to prevent type errors; they also facilitate code comprehension and navigation. The typing module provides the core types.
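As a small illustration of the paragraph above, the following sketch annotates a function using types from the typing module. The function `mean` is a made-up example and is not part of Typilus.

```python
from typing import List, Optional


def mean(values: List[float]) -> Optional[float]:
    """Return the arithmetic mean, or None for an empty list."""
    if not values:
        return None
    return sum(values) / len(values)


# Annotations are stored on the function object; Python does not enforce them at runtime.
print(mean.__annotations__)
```

An external checker such as mypy or pyright would flag a call like `mean("abc")`, whereas the interpreter itself only fails once the body runs.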

Why use machine learning? Given the dynamic nature of Python, type inference is challenging, especially over partial contexts. To tackle this challenge, we use a graph neural network model that predicts types by probabilistically reasoning over a program’s structure, names, and patterns. This allows us to make suggestions with only a partial context, at the cost of suggesting some false positives.

Install Action in your Repository

To use the GitHub action, create a workflow file. For example,

name: Typilus Type Annotation Suggestions

# Controls when the action will run. Triggers the workflow on pull request
# events, but only for the master branch
on:
  pull_request:
    branches: [ master ]

jobs:
  suggest:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so that typilus can access it.
    - uses: actions/[email protected]
    - uses: typilus/[email protected]
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        MODEL_PATH: path/to/model.pkl.gz   # Optional: path to a custom model to use instead of the pre-trained model.
        SUGGESTION_CONFIDENCE_THRESHOLD: 0.8   # Minimum confidence for suggestions on un-annotated locations. A float in [0, 1]. Default: 0.8
        DISAGREEMENT_CONFIDENCE_THRESHOLD: 0.95  # Minimum confidence for suggestions on already-annotated locations. A float in [0, 1]. Default: 0.95

The action uses the GITHUB_TOKEN to retrieve the diff of the pull request and to post comments on the analyzed pull request.
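To illustrate the two confidence thresholds in the workflow above, here is a minimal sketch of how such settings could be read from the environment and applied. This is not the action's actual code: the `Suggestion` class, `read_threshold`, `filter_suggestions`, and the use of `>=` for the comparison are all assumptions made for illustration. Only the variable names and defaults come from the workflow file.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class Suggestion:
    """Hypothetical container for one predicted annotation."""
    symbol: str
    predicted_type: str
    confidence: float
    existing_annotation: Optional[str] = None  # None means the location is un-annotated


def read_threshold(env: Mapping[str, str], name: str, default: float) -> float:
    """Read a float in [0, 1] from the environment, falling back to a default."""
    value = float(env.get(name, default))
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be in [0, 1], got {value}")
    return value


def filter_suggestions(
    suggestions: List[Suggestion], env: Mapping[str, str] = os.environ
) -> List[Suggestion]:
    """Keep only suggestions that clear the threshold for their kind of location."""
    suggest_min = read_threshold(env, "SUGGESTION_CONFIDENCE_THRESHOLD", 0.8)
    disagree_min = read_threshold(env, "DISAGREEMENT_CONFIDENCE_THRESHOLD", 0.95)
    kept = []
    for s in suggestions:
        if s.existing_annotation is None:
            # Un-annotated location: use the lower threshold.
            if s.confidence >= suggest_min:
                kept.append(s)
        elif s.predicted_type != s.existing_annotation and s.confidence >= disagree_min:
            # Annotated location: only report confident disagreements.
            kept.append(s)
    return kept
```

Raising either threshold trades fewer false positives for fewer suggestions overall.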

Technical Details & Internals

This GitHub action is a reimplementation of the Graph2Class model of Allamanis et al. (PLDI 2020) using the ptgnn library. Internally, it uses a graph neural network to predict likely type annotations for Python code.

This action uses a pre-trained neural network that has been trained on a corpus of open-source repositories that use Python's type annotations. At this point we do not support online adaptation of the model to each project.

Training your own model

You may wish to train your own model and use it in this action. To do so, please follow the steps in ptgnn. Then provide a path to the model in your GitHub action configuration, through the MODEL_PATH environment variable.

Contributing

We welcome external contributions and ideas. Please look at the issues in the repository for ideas and improvements.

Comments

IndexError: list index out of range

Diff GET Status Code:  200
Traceback (most recent call last):
  File "/usr/src/entrypoint.py", line 81, in <module>
    changed_files = get_changed_files(diff_rq.text)
  File "/usr/src/changeutils.py", line 38, in get_changed_files
    assert file_diff_lines[3].startswith("---")
IndexError: list index out of range
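The traceback shows the parser indexing a fixed position in a file's diff lines. As a hedged sketch of a more defensive approach (not the code in changeutils.py, and not a confirmed diagnosis of this issue), the following splits a unified diff into per-file sections and skips any section that has no `+++` header, such as rename-only or empty-file entries. The function names `split_file_diffs` and `changed_files_with_content` are hypothetical.

```python
from typing import List


def split_file_diffs(diff_text: str) -> List[List[str]]:
    """Split a unified diff into one list of lines per `diff --git` section."""
    chunks: List[List[str]] = []
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            chunks.append([])
        if chunks:  # ignore anything before the first section header
            chunks[-1].append(line)
    return chunks


def changed_files_with_content(diff_text: str) -> List[str]:
    """Return target paths of sections that carry content changes."""
    paths: List[str] = []
    for chunk in split_file_diffs(diff_text):
        # The `+++ ` header precedes any hunk, so the first match is the header.
        target = next((line for line in chunk if line.startswith("+++ ")), None)
        if target is None:
            continue  # no header: rename-only, mode change, empty or binary file
        path = target[4:]
        if path == "/dev/null":
            continue  # the file was deleted
        if path.startswith("b/"):
            path = path[2:]
        paths.append(path)
    return paths
```

Searching for the header instead of indexing a fixed line means short sections are skipped rather than raising an IndexError.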

logs_302.zip

opened by ZdenekM 1

Several small fixes

Here are a couple of things I noticed while trying Typilus inference using the GitHub action:

gracefully handle patches that include file renames (without any content modifications) by skipping such files

extractor stats reporting only processed files

opened by bzz 0

Create a ptgnn-based Typilus model

Create and use the full Typilus model instead of graph2class.

[ ] Implement it in ptgnn

[ ] Use action cache to store intermediate result

[ ] Auto-update type space "once in a while"

enhancement
opened by mallamanis 0

Releases (v0.9)

v0.9 (May 18, 2020)

v0.2 (May 18, 2020)

v0.1 (May 13, 2020)
