Code and data of the Fine-Grained R2R Dataset proposed in paper Sub-Instruction Aware Vision-and-Language Navigation

Overview

Fine-Grained R2R

Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP2020 paper Sub-Instruction Aware Vision-and-Language Navigation.

Code of the navigator will be released soon.

This dataset enriches the benchmark Room-to-Room (R2R) dataset by dividing the instructions into sub-instructions and pairing each of those with their corresponding viewpoints in the path.

  • The copyright resides with the authors of the paper Sub-Instruction Aware Vision-and-Language Navigation.
  • This dataset is build upon the Room-to-Room (R2R) dataset, we refer the readers to its repository for more details.

Data

The Fine-Grained R2R data, which enriches the R2R dataset with sub-instructions and their corresponding paths. The overall instruction and trajectory of each sample remains the same.

  • For paths in the train, the validation seen and the validation unseen splits, we add two new entries:

    • new_instructions: A list of sub-instructions produced by the Chunking Function from the complete instructions. You can use import ast and ast.literal_eval() to read it a list.
    • chunk_view: A list of sub-paths corresponding to the sub-instructions, where each number in the list is an index of a viewpoint in the ground-truth path. The index starts at 1.
  • Some sub-instructions which refer to camera rotation or a STOP action could match to a single viewpoint.

  • For the test unseen split, we only provide the sub-instructions but not the sub-paths.

Source

The code of the proposed Chunking Function for generating sub-instructions.

  • Install the StanfordNLP package (v0.1.2 in our experiment) and download the English models for the neural pipeline.

  • Run make_subinstr.py to generate data with sub-instructions from the original R2R data.

  • The generated files had been sent to the Amazon Mechanical Turk (AMT) for annotating the sub-paths.

Reference

If you use or dicsuss the Fine-Grained R2R dataset in your work, please cite our paper:

@article{hong2020sub,
  title={Sub-Instruction Aware Vision-and-Language Navigation},
  author={Hong, Yicong and Rodriguez-Opazo, Cristian and Wu, Qi and Gould, Stephen},
  journal={arXiv preprint arXiv:2004.02707},
  year={2020}
}

Contact

If you have any question regarding the dataset or publication, please create an issue in this repository or email to [email protected].

Owner
YicongHong
I don't even know where is the end of our universe, how am I suppose to know that?
YicongHong
PyTorch implementation for our paper Learning Character-Agnostic Motion for Motion Retargeting in 2D, SIGGRAPH 2019

Learning Character-Agnostic Motion for Motion Retargeting in 2D We provide PyTorch implementation for our paper Learning Character-Agnostic Motion for

Rundi Wu 367 Dec 22, 2022
Realtime segmentation with ENet, the fast and accurate segmentation net.

Enet This is a realtime segmentation net with almost 22 fps on GTX1080 ti, and the model size is very small with only 28M. This repo contains the infe

JinTian 14 Aug 30, 2022
Repository aimed at compiling code, papers, demos etc.. related to my PhD on 3D vision and machine learning for fruit detection and shape estimation at the university of Lincoln

PhD_3DPerception Repository aimed at compiling code, papers, demos etc.. related to my PhD on 3D vision and machine learning for fruit detection and s

lelouedec 2 Oct 06, 2022
This is a classifier which basically predicts whether there is a gun law in a state or not, depending on various things like murder rates etc.

Gun-Laws-Classifier This is a classifier which basically predicts whether there is a gun law in a state or not, depending on various things like murde

Awais Saleem 1 Jan 20, 2022
PerfFuzz: Automatically Generate Pathological Inputs for C/C++ programs

PerfFuzz Performance problems in software can arise unexpectedly when programs are provided with inputs that exhibit pathological behavior. But how ca

Caroline Lemieux 125 Nov 18, 2022
Open standard for machine learning interoperability

Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides

Open Neural Network Exchange 13.9k Dec 30, 2022
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr

43 Nov 21, 2022
A deep learning library that makes face recognition efficient and effective

Distributed Arcface Training in Pytorch This is a deep learning library that makes face recognition efficient, and effective, which can train tens of

Sajjad Aemmi 10 Nov 23, 2021
Physics-Aware Training (PAT) is a method to train real physical systems with backpropagation.

Physics-Aware Training (PAT) is a method to train real physical systems with backpropagation. It was introduced in Wright, Logan G. & Onodera, Tatsuhiro et al. (2021)1 to train Physical Neural Networ

McMahon Lab 230 Jan 05, 2023
Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.

Project Azua 0. Overview Many modern AI algorithms are known to be data-hungry, whereas human decision-making is much more efficient. The human can re

Microsoft 197 Jan 06, 2023
Official implementation of CATs: Cost Aggregation Transformers for Visual Correspondence NeurIPS'21

CATs: Cost Aggregation Transformers for Visual Correspondence NeurIPS'21 For more information, check out the paper on [arXiv]. Training with different

Sunghwan Hong 120 Jan 04, 2023
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Deformable Attention Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement to what was proposed in DET

Phil Wang 128 Dec 24, 2022
Generative Flow Networks for Discrete Probabilistic Modeling

Energy-based GFlowNets Code for Generative Flow Networks for Discrete Probabilistic Modeling by Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Vo

Narsil-Dinghuai Zhang 51 Dec 20, 2022
Representing Long-Range Context for Graph Neural Networks with Global Attention

Graph Augmentation Graph augmentation/self-supervision/etc. Algorithms gcn gcn+virtual node gin gin+virtual node PNA GraphTrans Augmentation methods N

UC Berkeley RISE 67 Dec 30, 2022
Implementation for Homogeneous Unbalanced Regularized Optimal Transport

HUROT: An Homogeneous formulation of Unbalanced Regularized Optimal Transport. This repository provides code related to this preprint. This is an alph

Théo Lacombe 1 Feb 17, 2022
BASH - Biomechanical Animated Skinned Human

We developed a method animating a statistical 3D human model for biomechanical analysis to increase accessibility for non-experts, like patients, athletes, or designers.

Machine Learning and Data Analytics Lab FAU 66 Nov 19, 2022
Language Models for the legal domain in Spanish done @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

Spanish legal domain Language Model ⚖️ This repository contains the page for two main resources for the Spanish legal domain: A RoBERTa model: https:/

Plan de Tecnologías del Lenguaje - Gobierno de España 12 Nov 14, 2022
AdaDM: Enabling Normalization for Image Super-Resolution

AdaDM AdaDM: Enabling Normalization for Image Super-Resolution. You can apply BN, LN or GN in SR networks with our AdaDM. Pretrained models (EDSR*/RDN

58 Jan 08, 2023
A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks)

A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks) This repository contains a PyTorch implementation for the paper: Deep Pyra

Greg Dongyoon Han 262 Jan 03, 2023
The fastai deep learning library

Welcome to fastai fastai simplifies training fast and accurate neural nets using modern best practices Important: This documentation covers fastai v2,

fast.ai 23.2k Jan 07, 2023