Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

Overview

The Second Situated Interactive MultiModal Conversations (SIMMC 2.0) Challenge 2021

Welcome to the Second Situated Interactive Multimodal Conversations (SIMMC 2.0) Track for DSTC10 2021.

The SIMMC challenge aims to lay the foundations for real-world assistant agents that can handle multimodal inputs and perform multimodal actions. As in the first SIMMC challenge (part of DSTC9), we focus on task-oriented dialogs that encompass a situated multimodal user context in the form of a co-observed, immersive virtual reality (VR) environment. The conversational context is dynamically updated on each turn based on user actions (e.g., verbal interactions or navigation within the scene). For this challenge, we release a new Immersive SIMMC 2.0 dataset covering two shopping domains: furniture and fashion.

Organizers: Seungwhan Moon, Satwik Kottur, Paul A. Crook, Ahmad Beirami, Babak Damavandi, Alborz Geramifard

Example from SIMMC-Furniture Dataset

Latest News

  • [June 14, 2021] Challenge announcement. Training / development datasets (SIMMC v2.0) are released.

Timeline

| Date | Milestone |
| --- | --- |
| June 14, 2021 | Training & development data released |
| Sept 24, 2021 | Test-Std data released, End of Challenge Phase 1 |
| Oct 1, 2021 | Entry submission deadline, End of Challenge Phase 2 |
| Oct 8, 2021 | Final results announced |

Track Description

Tasks and Metrics

We present four sub-tasks primarily aimed at replicating human-assistant actions in order to enable rich and interactive shopping scenarios.

| Sub-Task | Goal | Input | Output | Metrics |
| --- | --- | --- | --- | --- |
| #1 Multimodal Disambiguation | To classify whether the assistant should disambiguate in the next turn | Current user utterance, dialog context, multimodal context | Binary label | Binary classification accuracy |
| #2 Multimodal Coreference Resolution | To resolve referent objects to their canonical ID(s) as defined by the catalog | Current user utterance with object mentions, dialog context, multimodal context | Canonical object IDs | Coref F1 / Precision / Recall |
| #3 Multimodal Dialog State Tracking (MM-DST) | To track user belief states across multiple turns | Current user utterance, dialog context, multimodal context | Belief state for the current user utterance | Slot F1, Intent F1 |
| #4 Multimodal Dialog Response Generation & Retrieval | To generate assistant responses or retrieve them from a candidate pool | Current user utterance, dialog context, multimodal context, (ground-truth API calls) | Assistant response utterance | Generation: BLEU-4; Retrieval: MRR, R@1, R@5, R@10, Mean Rank |
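
To make the coreference metrics of Sub-Task #2 concrete, here is a minimal sketch of precision, recall, and F1 computed over predicted and ground-truth object ID sets. It assumes micro-averaging over all turns and integer object IDs; the function name `coref_prf` and the input layout are hypothetical, and the official evaluation script in the repository may aggregate differently.

```python
from typing import Iterable, Sequence, Tuple


def coref_prf(
    predicted: Sequence[Iterable[int]],
    gold: Sequence[Iterable[int]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over per-turn object ID sets.

    Assumption: each entry holds the object IDs for one turn, and the
    counts are pooled over all turns before the ratios are taken.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")

    n_correct = n_pred = n_gold = 0
    for pred_ids, gold_ids in zip(predicted, gold):
        pred_set, gold_set = set(pred_ids), set(gold_ids)
        n_correct += len(pred_set & gold_set)  # IDs present in both sets
        n_pred += len(pred_set)
        n_gold += len(gold_set)

    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Two turns: one spurious prediction in the first, one missed object in the second.
precision, recall, f1 = coref_prf([[1, 2], [3]], [[1], [3, 4]])
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy input, two of the three predicted IDs are correct and two of the three ground-truth IDs are found, so precision, recall, and F1 all equal 2/3.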

Please check the task input file for a full description of inputs for each subtask.
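
The retrieval metrics listed for Sub-Task #4 (MRR, recall at k, and mean rank) all derive from the rank that a model assigns to the ground-truth response within the candidate pool. The sketch below illustrates that relationship under the assumption that ranks are 1-based and one rank is available per turn; the function name `retrieval_metrics` and the output keys are hypothetical and not taken from the challenge's evaluation code.

```python
from typing import Dict, Sequence


def retrieval_metrics(
    gold_ranks: Sequence[int],
    ks: Sequence[int] = (1, 5, 10),
) -> Dict[str, float]:
    """Compute MRR, recall@k, and mean rank from 1-based gold ranks.

    Assumption: gold_ranks[i] is the position of the ground-truth
    response in the model's ranking of the candidates for turn i.
    """
    if not gold_ranks:
        raise ValueError("gold_ranks must not be empty")
    if any(rank < 1 for rank in gold_ranks):
        raise ValueError("ranks are expected to be 1-based")

    n = len(gold_ranks)
    metrics = {
        "mrr": sum(1.0 / rank for rank in gold_ranks) / n,
        "mean_rank": sum(gold_ranks) / n,
    }
    for k in ks:
        # Fraction of turns whose gold response is ranked within the top k.
        metrics[f"r@{k}"] = sum(rank <= k for rank in gold_ranks) / n
    return metrics


scores = retrieval_metrics([1, 2, 10, 20])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

For the four example ranks, recall at 1, 5, and 10 is 0.25, 0.5, and 0.75, the mean rank is 8.25, and MRR is about 0.41. Higher is better for MRR and recall at k, while lower is better for mean rank.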

Evaluation

For the DSTC10 SIMMC Track, we will run a two-phase evaluation as follows.

Challenge Period 1: Participants will evaluate the model performance on the provided devtest set. At the end of Challenge Period 1 (Sept 24), we ask participants to submit their model prediction results and a link to their code repository.

Challenge Period 2: A test-std set will be released on Sept 24 to the participants who submitted results for Challenge Period 1. We ask participants to submit their model predictions on the test-std set by Oct 1. We will announce the final results and the winners on Oct 8.

Challenge Instructions

(1) Challenge Registration

  • Fill out this form to register at DSTC10. Check “Track 3: SIMMC 2.0: Situated Interactive Multimodal Conversational AI” along with other tracks you are participating in.

(2) Download Datasets and Code

  • Irrespective of participation in the challenge, we'd like to encourage those interested in this dataset to complete this optional survey. This will also help us communicate any future updates on the codebase, the datasets, and the challenge track.

  • Git clone our repository to download the datasets and the code. You may use the provided baselines as a starting point to develop your models.

$ git lfs install
$ git clone https://github.com/facebookresearch/simmc2.git
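
Because the clone step relies on Git LFS, a checkout made without it can leave small text stubs in place of the large data files. The following sketch is one way to detect that situation: it looks for the standard first line of a Git LFS pointer file. The helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical, and nothing here is part of the repository's own tooling.

```python
from pathlib import Path
from typing import List

# Every Git LFS pointer file starts with this version line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file looks like an un-fetched Git LFS pointer stub."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root: Path) -> List[Path]:
    """List files under root (skipping .git) that are still LFS pointers."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and ".git" not in path.parts and is_lfs_pointer(path)
    )
```

Calling `find_lfs_pointers(Path("simmc2"))` on the cloned directory should return an empty list when the data was fetched correctly. If it lists files, running `git lfs pull` inside the repository is the usual way to download the real content.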

(3) Reporting Results for Challenge Phase 1

  • Submit your model prediction results on the devtest set, following the submission instructions.
  • We will release the test-std set (with ground-truth labels hidden) on Sept 24.

(4) Reporting Results for Challenge Phase 2

  • Submit your model prediction results on the test-std set, following the submission instructions.
  • We will evaluate the participants’ model predictions using the same evaluation script as in Phase 1, and announce the results.

Contact

Questions related to SIMMC Track, Data, and Baselines

Please contact [email protected], or leave comments in the GitHub repository.

DSTC Mailing List

If you want to get the latest updates about DSTC10, join the DSTC mailing list.

Citations

If you want to publish experimental results with our datasets or use the baseline models, please cite the following article:

@article{kottur2021simmc,
  title={SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations},
  author={Kottur, Satwik and Moon, Seungwhan and Geramifard, Alborz and Damavandi, Babak},
  journal={arXiv preprint arXiv:2104.08667},
  year={2021}
}

NOTE: The paper above describes in detail the datasets, the collection process, and some of the baselines we provide in this challenge. The paper reports the results from an earlier version of the dataset and with different train-dev-test splits, hence the baseline performances on the challenge resources will be slightly different.

License

SIMMC 2.0 is released under CC-BY-NC-SA-4.0, see LICENSE for details.

Owner
Facebook Research