Code Generation using a large neural network called GPT-J

Last update: Dec 31, 2022

Overview

CodeGenX

CodeGenX is an AI-powered code generation system. It is delivered as a Visual Studio Code extension and is free and open source!

Installation

Installation instructions and additional information about CodeGenX are available in the project documentation.

About CodeGenX

1. Languages Supported

CodeGenX currently supports only Python. We plan to add more languages in future releases.

2. Modules Trained On

CodeGenX was trained on Python code covering many common uses of the language. Libraries it was specifically trained on include:

TensorFlow
PyTorch
scikit-learn
Pandas
NumPy
OpenCV
Django
Flask
PyGame

3. How CodeGenX Works

At the core of CodeGenX lies GPT-J, a 6-billion-parameter transformer model trained on hundreds of gigabytes of text from the internet. We fine-tuned this model on a dataset of open-source Python code. Given an input with the right instructions, the fine-tuned model generates code.
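
As a rough illustration of "an input with the right instructions", the sketch below combines existing editor code with a comment-style instruction and hands the result to a model backend. This README does not describe the actual prompt format or API used by CodeGenX, so every name here (`build_prompt`, `complete`, `fake_generate`) is hypothetical, and the backend is a stub standing in for the fine-tuned model.

```python
from typing import Callable


def build_prompt(instruction: str, context: str = "") -> str:
    """Combine existing code with an instruction written as Python comments.

    Hypothetical helper: the real CodeGenX prompt format may differ.
    """
    comment = "\n".join(f"# {line}" for line in instruction.strip().splitlines())
    parts = [context.rstrip(), comment] if context.strip() else [comment]
    return "\n".join(parts) + "\n"


def complete(prompt: str, generate: Callable[[str], str]) -> str:
    """Return the prompt followed by whatever the model backend generates."""
    return prompt + generate(prompt)


def fake_generate(prompt: str) -> str:
    # Stand-in for the fine-tuned model (assumption): returns a canned
    # completion so the example runs without downloading any weights.
    return "def add(a, b):\n    return a + b\n"


prompt = build_prompt("Write a function that adds two numbers", context="import math")
result = complete(prompt, fake_generate)
print(result)
```

Passing the backend in as a callable keeps the prompt-building logic separate from the model, so the stub could be swapped for a real text-generation call without changing the rest of the code.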

Contributors ✨

This project would not have been possible without the help of these wonderful people:

Arya Manjaramkar
Matthias Wijnsma
Thomas Houtrique
Dominic Rampas
Bilel Medimegh
Josh Hills
Alex
Tiimo

Acknowledgements

Many thanks to the Google TPU Research Cloud for providing the compute needed for this project.


Owner

DeepGenX
