Simple data balancing baselines for worst-group-accuracy benchmarks.

Last update: Dec 02, 2022

BalancingGroups

Code to replicate the experimental results from the paper "Simple data balancing baselines achieve competitive worst-group-accuracy".

Replicating the main results

Installing dependencies

The easiest way to get a working environment for this repo is to create a conda environment with the following commands:

conda env create -f environment.yaml
conda activate balancinggroups

If conda is not available, please install the dependencies listed in the requirements.txt file.

Download, extract and generate metadata for datasets

This script downloads and extracts the datasets, then formats their metadata so that it works with the rest of the code out of the box.

python setup_datasets.py --download --data_path data

Launch jobs

To reproduce the experiments in the paper on a SLURM cluster:

# Launching 1400 combo seeds = 50 hparams for 4 datasets for 7 algorithms
# Each combo seed is run 5 times to compute error bars, totalling 7000 jobs
python train.py --data_path data --output_dir main_sweep --num_hparams_seeds 1400 --num_init_seeds 5 --partition <slurm_partition>

If you want to run the jobs locally, omit the --partition argument.
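
The job counts quoted in the comments above can be checked with a short calculation. This is only a sketch of the arithmetic: the constant and function names are hypothetical and do not come from train.py, and only the four counts (50, 4, 7 and 5) are taken from this README.

```python
from itertools import product

# Counts taken from the README comments; the names are hypothetical.
NUM_HPARAMS = 50      # hyper-parameter draws per dataset/algorithm pair
NUM_DATASETS = 4
NUM_ALGORITHMS = 7
NUM_INIT_SEEDS = 5    # repetitions used to compute error bars


def count_jobs(num_hparams=NUM_HPARAMS, num_datasets=NUM_DATASETS,
               num_algorithms=NUM_ALGORITHMS, num_init_seeds=NUM_INIT_SEEDS):
    """Return (number of combo seeds, total number of jobs)."""
    # One combo per (algorithm, dataset, hyper-parameter draw) triple.
    combos = list(product(range(num_algorithms),
                          range(num_datasets),
                          range(num_hparams)))
    # Every combo is repeated once per initialization seed.
    return len(combos), len(combos) * num_init_seeds


num_combos, num_jobs = count_jobs()
print(num_combos, num_jobs)
```

The two printed values correspond to --num_hparams_seeds 1400 and the 7000 total jobs mentioned in the comments.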

Parse results

The parse.py script can generate all of the plots and tables from the paper. By default, it generates the best test worst-group-accuracy table for each dataset/method. This script can be called while the experiments are still running.

python parse.py main_sweep
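
For readers unfamiliar with the metric, worst-group accuracy is the lowest accuracy obtained on any single group of examples. The sketch below illustrates that definition on toy data. It is not the code used by parse.py: the function name, the inputs and the group identifiers are assumptions made for illustration.

```python
from collections import defaultdict


def worst_group_accuracy(predictions, labels, groups):
    """Return (worst-group accuracy, per-group accuracy dict).

    Hypothetical helper: `groups` holds one group identifier per example.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label, group in zip(predictions, labels, groups):
        total[group] += 1
        correct[group] += int(pred == label)
    per_group = {g: correct[g] / total[g] for g in total}
    # The metric is the minimum accuracy over all groups.
    return min(per_group.values()), per_group


# Toy data: group "a" gets 2 of 3 right, group "b" gets 1 of 3 right.
preds = [1, 1, 0, 0, 0, 0]
labels = [1, 0, 0, 0, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]

wga, per_group = worst_group_accuracy(preds, labels, groups)
print(wga, per_group)
```

On this toy data the average accuracy is one half, while the worst-group accuracy is one third, which is why the two metrics can rank methods differently.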

License

This source code is released under the CC-BY-NC license, included here.

Owner

Meta Research
