Multi-Objective Reinforced Active Learning

Last update: Nov 19, 2022

Dependencies

wandb
tqdm
pytorch >= 1.7.0
numpy >= 1.20.0
scipy >= 1.1.0
pycolab == 1.2

Weights and Biases

Our code depends on Weights and Biases (wandb) for visualizing and logging results during training. As a result, we call wandb.init(), which prompts for an API key to link the training runs with your personal wandb account. You can provide it by pasting your WANDB_API_KEY into the prompt when running the code for the first time.
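If you would rather not answer the prompt interactively (for example on a remote machine), wandb also reads its settings from environment variables. The sketch below is a hypothetical helper, not part of this repository; it only assumes the standard WANDB_API_KEY and WANDB_MODE variables that wandb itself reads.

```python
import os


def wandb_environment(api_key=None, offline=False, base_env=None):
    """Build an environment mapping for a non-interactive wandb run.

    WANDB_API_KEY and WANDB_MODE are environment variables read by wandb;
    this function is a hypothetical helper and not part of the repository.
    """
    env = dict(os.environ if base_env is None else base_env)
    if api_key is not None:
        env["WANDB_API_KEY"] = api_key  # lets wandb skip the interactive prompt
    if offline:
        env["WANDB_MODE"] = "offline"   # log locally instead of uploading
    return env
```

The returned mapping can be passed as the env argument of subprocess.run when launching one of the training scripts, or the same two variables can simply be exported in the shell before running the code.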

Environments

Our gridworlds (Emergency: randomized_v2.py, Delivery: randomized_v3.py) build on the pycolab game engine with a custom wrapper that provides functionality similar to gym environments. The engine comes with a user interface, and any environment can be played in the console by running python environment.py, with the arrow keys and w, a, s, d as controls.
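To illustrate what a gym-style interface looks like, here is a minimal sketch of a toy environment exposing reset() and step() in the classic gym convention of returning (observation, reward, done, info). The ToyGridWorld class, its corridor layout and its reward are assumptions made for illustration; they are not the wrapper or the gridworlds shipped in this repository.

```python
class ToyGridWorld:
    """Hypothetical 1-D corridor with a gym-style reset/step interface."""

    ACTIONS = {0: -1, 1: +1}  # 0 = move left, 1 = move right

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action and return (observation, reward, done, info)."""
        move = self.ACTIONS[action]
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done, {"steps": self.steps}


# Roll out one episode with a fixed policy that always moves right.
env = ToyGridWorld(length=4)
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done, info = env.step(1)
    total_reward += reward
```

An agent written against this interface only needs reset() and step(), which is what makes a wrapper of this kind convenient for training code.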

Training

There are four training scripts for

manually training a PPO agent on custom rewards (ppo_train.py),
training AIRL on a single expert dataset (airl_train.py),
active MORL with custom/automatic preferences (moral_train.py) and
training DRLHP with custom/automatic preferences (drlhp_train.py).

When using automatic preferences, a desired ratio can be passed as an argument. For example,

python moral_train.py --ratio a b c

will run MORAL using a (real-valued) ratio of a:b:c among the three explicit objectives in Delivery.
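As a rough illustration of how such a ratio could be turned into preference weights, the sketch below parses three real values from --ratio and rescales them to sum to one, which leaves the ratio a:b:c unchanged. The parsing details and the normalization step are assumptions; moral_train.py may handle the argument differently.

```python
import argparse


def parse_ratio(argv):
    """Parse three real-valued ratio entries from a --ratio argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--ratio", nargs=3, type=float, metavar=("A", "B", "C"))
    return parser.parse_args(argv).ratio


def normalize(ratio):
    """Rescale a non-negative ratio so its entries sum to one."""
    total = sum(ratio)
    if total <= 0 or any(r < 0 for r in ratio):
        raise ValueError("ratio entries must be non-negative and not all zero")
    return [r / total for r in ratio]


# Equivalent to passing "--ratio 1 1 2" on the command line.
weights = normalize(parse_ratio(["--ratio", "1", "1", "2"]))
```

With the ratio 1:1:2, the third objective receives half of the total weight and the other two a quarter each.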

Hyperparameters

Hyperparameters are passed as arguments to wandb.init() and can be changed by modifying the respective training files.
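The general pattern of passing hyperparameters through wandb.init() can be sketched as follows. The hyperparameter names, their values and the project name are hypothetical placeholders, not the ones used in the training files; only the project and config arguments of wandb.init() are taken from the wandb API.

```python
# Hypothetical defaults; the actual names and values live in the training files.
DEFAULTS = {"lr": 3e-4, "gamma": 0.99, "n_workers": 8}


def build_config(overrides=None):
    """Return a copy of DEFAULTS with the given overrides applied."""
    config = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in config:
            raise KeyError(f"unknown hyperparameter: {key}")
        config[key] = value
    return config


def start_run(config, project="example-project"):
    """Start a wandb run that records the hyperparameters (requires wandb)."""
    import wandb  # imported lazily so build_config works without wandb installed

    return wandb.init(project=project, config=config)


config = build_config({"gamma": 0.95})
```

Calling start_run(config) would then log the chosen values alongside the run, so changing a hyperparameter amounts to editing the dictionary passed to wandb.init().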

Owner

Markus Peschl
