Few-shot Learning of GPT-3

Overview

Few-shot Learning With Language Models

This is a codebase to perform few-shot "in-context" learning using language models similar to the GPT-3 paper. In particular, a few training examples are placed into a natural language "prompt" and predictions are made by generating from the language model. See the GPT-3 paper and Calibrate Before Use for more information.

You can run this codebase with GPT-3 (if you have a key from OpenAI), GPT-2, and any other language model available in HuggingFace Transformers. If you have a GPT-3 key, you should place your API key into a file named openai_key.txt. The underlying model you use is abstracted away using a common API.

Running this codebase will report results with and without contextual calibration.

Dependencies

This code is written using PyTorch and HuggingFace's Transformer repo. If you are running a model locally (e.g., GPT-2), the code requires a single GPU. Running these experiments is relatively lightweight (there is no training), so a single GPU is sufficient. It is technically possible to run the experiments without a GPU, but the runtime will be slow.

Installation

The easiest way to install the code is to create a fresh anaconda environment:

conda create -n fewshot python=3.6
source activate fewshot
pip install -r requirements.txt

Now you should be ready to go!

Replicating Our Results

Here is how to replicate the results from our paper for GPT-2. To replicate the results for classification tasks:

CUDA_VISIBLE_DEVICES=0 python run_classification.py \
--model="gpt2-xl" \
--dataset="sst2, trec, cb, agnews, dbpedia" \
--num_seeds=5 \
--all_shots="0, 1, 4, 8" \
--subsample_test_set=300 \
--approx

To replicate the results for extraction tasks:

CUDA_VISIBLE_DEVICES=0 python run_extraction.py \
--model="gpt2-xl" \
--dataset="mit_movie_Genre, mit_movie_Director, atis_airline_name, atis_depart_date.day_name" \
--num_seeds=5 \
--all_shots="0, 1, 4, 8" \
--subsample_test_set=300

To replicate the results for LAMA:

CUDA_VISIBLE_DEVICES=0 python run_lama.py

Note that after we refactored our code, the training sets are not the same ones used in our results table. We expect the results to differ slightly but they should match the same trends seen in our results.

Overview of Codebase

Data

The data folder contains the raw data for numerous tasks. If you'd like to add your own task, add the data into that folder. The code for loading a dataset, as well as defining the prompt format for a task, is in utils/data_utils.py. We have loaders for a wide range of existing datasets. If you want to add a new dataset that is similar in structure to any of the existing datasets (e.g., its text classification) adding it should be very simple---you can use an existing dataset as a guide.

Utils

The utils folder contains all of the code for calling the underlying models, getting the probabilities of each label token, possibly applying contextual calibration, and more. If you just want to evaluate few-shot learning on your task, you should not need to modify this code. If you want to extend our code (e.g., modify how decisions are made) this is the place to look.

Run Scripts

The run scripts, e.g., run_classification.py, contain the code for randomly sampling the examples to use in the prompt, calling the models, the necessary evaluation metrics, and more. If you are adding a new task format (one that is not classification, QA) then you will need to write your own run script. Inside the run script, you can set the parameters for the experiments using the command line arguments.

For all experiments, we save and pickle the outputs of the model. This makes doing a post-hoc analysis of the accuracy / plotting results / etc. very fast. You can also use the saved outputs to evaluate how the accuracy would have changed if a different decision making function was used (e.g., accuracy with and without contextual calibration).

References

Please consider citing our work if you found this code or our paper beneficial to your research.

@article{Zhao2021Calibrate,	
  Author = {Tony Z. Zhao and Eric Wallace and Shi Feng and Dan Klein and Sameer Singh},	
  Journal={arXiv preprint arXiv:2102.09690},	
  Year = {2021},	
  Title = {Calibrate Before Use: Improving Few-shot Performance of Language Models}	
}    	

Contributions and Contact

This code was developed by Tony Z. Zhao and Eric Wallace, contact available at [email protected] and [email protected].

If you'd like to contribute code, feel free to open a pull request. If you find an issue, please open an issue.

Owner
Tony Z. Zhao
UC Berkeley EECS, working on robotics, NLP and ML
Tony Z. Zhao
The code release of paper Low-Light Image Enhancement with Normalizing Flow

[AAAI 2022] Low-Light Image Enhancement with Normalizing Flow Paper | Project Page Low-Light Image Enhancement with Normalizing Flow Yufei Wang, Renji

Yufei Wang 176 Jan 06, 2023
TrackFormer: Multi-Object Tracking with Transformers

TrackFormer: Multi-Object Tracking with Transformers This repository provides the official implementation of the TrackFormer: Multi-Object Tracking wi

Tim Meinhardt 321 Dec 29, 2022
Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

Image Crop Analysis This is a repo for the code used for reproducing our Image Crop Analysis paper as shared on our blog post. If you plan to use this

Twitter Research 239 Jan 02, 2023
Centroid-UNet is deep neural network model to detect centroids from satellite images.

Centroid UNet - Locating Object Centroids in Aerial/Serial Images Introduction Centroid-UNet is deep neural network model to detect centroids from Aer

GIC-AIT 19 Dec 08, 2022
PyTorch implementation of "Learn to Dance with AIST++: Music Conditioned 3D Dance Generation."

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation. Installation pip install -r requirements.txt Prepare Dataset bash data/scripts/pre

Zj Li 8 Sep 07, 2021
What can linearized neural networks actually say about generalization?

What can linearized neural networks actually say about generalization? This is the source code to reproduce the experiments of the NeurIPS 2021 paper

gortizji 11 Dec 09, 2022
OOD Dataset Curator and Benchmark for AI-aided Drug Discovery

🔥 DrugOOD 🔥 : OOD Dataset Curator and Benchmark for AI Aided Drug Discovery This is the official implementation of the DrugOOD project, this is the

108 Dec 17, 2022
Back to Event Basics: SSL of Image Reconstruction for Event Cameras

Back to Event Basics: SSL of Image Reconstruction for Event Cameras Minimal code for Back to Event Basics: Self-Supervised Learning of Image Reconstru

TU Delft 42 Dec 26, 2022
Rename Images with Auto Generated Neural Image Captions

Recaption Images with Generated Neural Image Caption Example Usage: Commandline: Recaption all images from folder /home/feng/Downloads/images to folde

feng wang 3 May 01, 2022
To model the probability of a soccer coach leave his/her team during Campeonato Brasileiro for 10 chosen teams and considering years 2018, 2019 and 2020.

To model the probability of a soccer coach leave his/her team during Campeonato Brasileiro for 10 chosen teams and considering years 2018, 2019 and 2020.

Larissa Sayuri Futino Castro dos Santos 1 Jan 20, 2022
a project for 3D multi-object tracking

a project for 3D multi-object tracking

155 Jan 04, 2023
tf2onnx - Convert TensorFlow, Keras and Tflite models to ONNX.

tf2onnx converts TensorFlow (tf-1.x or tf-2.x), tf.keras and tflite models to ONNX via command line or python api.

Open Neural Network Exchange 1.8k Jan 08, 2023
Measuring Coding Challenge Competence With APPS

Measuring Coding Challenge Competence With APPS This is the repository for Measuring Coding Challenge Competence With APPS by Dan Hendrycks*, Steven B

Dan Hendrycks 218 Dec 27, 2022
The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

Will Thompson 166 Jan 04, 2023
Official implementation for the paper "SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization".

SAPE Project page Paper Official implementation for the paper "SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization". Environment Cre

36 Dec 09, 2022
[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

MDCA Calibration This is the official PyTorch implementation for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved

MDCA Calibration 21 Dec 22, 2022
Object detection GUI based on PaddleDetection

PP-Tracking GUI界面测试版 本项目是基于飞桨开源的实时跟踪系统PP-Tracking开发的可视化界面 在PaddlePaddle中加入pyqt进行GUI页面研发,可使得整个训练过程可视化,并通过GUI界面进行调参,模型预测,视频输出等,通过多种类型的识别,简化整体预测流程。 GUI界面

杨毓栋 68 Jan 02, 2023
GazeScroller - Using Facial Movements to perform Hands-free Gesture on the system

GazeScroller Using Facial Movements to perform Hands-free Gesture on the system

2 Jan 05, 2022
Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification

Fine-grainedImageClassification Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification We trained model here: lin

ZhenchaoTang 14 Oct 21, 2022
Code repository accompanying the paper "On Adversarial Robustness: A Neural Architecture Search perspective"

On Adversarial Robustness: A Neural Architecture Search perspective Preparation: Clone the repository: https://github.com/tdchaitanya/nas-robustness.g

Chaitanya Devaguptapu 4 Nov 10, 2022