The Habitat-Matterport 3D Research Dataset - the largest-ever dataset of 3D indoor spaces.

Overview

Habitat-Matterport 3D Dataset (HM3D)

The Habitat-Matterport 3D Research Dataset is the largest-ever dataset of 3D indoor spaces. It consists of 1,000 high-resolution 3D scans (or digital twins) of building-scale residential, commercial, and civic spaces generated from real-world environments.

HM3D is freely available for academic, non-commercial research. Researchers can use it with FAIR’s Habitat simulator to train embodied agents, such as home robots and AI assistants, at scale.

This repository contains the code and instructions to reproduce experiments from our NeurIPS 2021 paper. If you use the HM3D dataset or the experimental code in your research, please cite the HM3D paper.

@inproceedings{ramakrishnan2021hm3d,
  title={Habitat-Matterport 3D Dataset ({HM}3D): 1000 Large-scale 3D Environments for Embodied {AI}},
  author={Santhosh Kumar Ramakrishnan and Aaron Gokaslan and Erik Wijmans and Oleksandr Maksymets and Alexander Clegg and John M Turner and Eric Undersander and Wojciech Galuba and Andrew Westbury and Angel X Chang and Manolis Savva and Yili Zhao and Dhruv Batra},
  booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)},
  year={2021},
  url={https://openreview.net/forum?id=-v4OuqNs5P}
}

Please check out our website for details on downloading and visualizing the HM3D dataset.

Installation instructions

We provide a common set of instructions to set up the environment for running all our experiments.

  1. Clone the HM3D GitHub repository and add it to PYTHONPATH.

    git clone https://github.com/facebookresearch/habitat-matterport3d-dataset.git
    cd habitat-matterport3d-dataset
    export PYTHONPATH=$PYTHONPATH:$PWD
    
  2. Create a conda environment and activate it.

    conda create -n hm3d python=3.8.3
    conda activate hm3d
    
  3. Install habitat-sim using conda.

    conda install habitat-sim headless -c conda-forge -c aihabitat
    

    See habitat-sim's installation instructions for more details.

  4. Install trimesh with soft dependencies.

    pip install "trimesh[easy]==3.9.1"
    
  5. Install remaining requirements from pip.

    pip install -r requirements.txt
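
After finishing the steps above, a quick sanity check can confirm that the key packages are visible from the active conda environment. The following is a minimal sketch, not part of the repository: it assumes the packages from steps 3 and 4 expose the top-level modules `habitat_sim` and `trimesh`, and the helper name `find_missing_modules` is hypothetical.

```python
import importlib.util

# Top-level module names assumed for the packages installed in steps 3 and 4.
REQUIRED_MODULES = ("habitat_sim", "trimesh")


def find_missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only locates the modules without importing them, so it does not verify that habitat-sim can actually create a renderer on your machine.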
    

Downloading datasets

In our paper, we benchmarked HM3D against prior indoor scene datasets such as Gibson, MP3D, RoboThor, Replica, and ScanNet.

  • Download each dataset based on these instructions from habitat-sim. In the case of RoboThor, convert the raw scan assets to GLB using assimp.

    assimp export <SOURCE SCAN FILE> <GLB FILE PATH>

  • Once the datasets are downloaded and processed, create environment variables pointing to the corresponding scene paths.

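    export GIBSON_ROOT=<PATH TO GIBSON SCENES>
    export MP3D_ROOT=<PATH TO MP3D SCENES>
    export ROBOTHOR_ROOT=<PATH TO ROBOTHOR SCENES>
    export HM3D_ROOT=<PATH TO HM3D SCENES>
    export REPLICA_ROOT=<PATH TO REPLICA SCENES>
    export SCANNET_ROOT=<PATH TO SCANNET SCENES>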
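
Before launching any experiment, it can help to verify that each of the six variables is set and points to an existing directory. The sketch below is an illustration under that assumption and is not part of the repository; the function name `check_dataset_roots` and the status strings are hypothetical.

```python
import os

# Variable names taken from the export commands above.
DATASET_ENV_VARS = (
    "GIBSON_ROOT",
    "MP3D_ROOT",
    "ROBOTHOR_ROOT",
    "HM3D_ROOT",
    "REPLICA_ROOT",
    "SCANNET_ROOT",
)


def check_dataset_roots(environ, names=DATASET_ENV_VARS):
    """Map each variable name to 'ok', 'unset', or 'not a directory'."""
    status = {}
    for name in names:
        value = environ.get(name, "")
        if not value:
            status[name] = "unset"
        elif not os.path.isdir(value):
            status[name] = "not a directory"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    for name, state in check_dataset_roots(os.environ).items():
        print(f"{name}: {state}")
```

Passing the environment as an argument, rather than reading `os.environ` inside the function, keeps the check easy to exercise with a plain dictionary.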
               
              
             
            
           
          
         

Running experiments

We provide the code for reproducing the results from our paper in different directories.

  • scale_comparison contains the code for comparing the scale of HM3D with other datasets (Tab. 1 in the paper).
  • quality_comparison contains the code for comparing the reconstruction completeness and visual fidelity of HM3D with other datasets (Fig. 4 and Tab. 5 in the paper).
  • pointnav_comparison contains the configs and instructions to train and evaluate PointNav agents on HM3D and other datasets (Tab. 2 and Fig. 7 in the paper).

We further provide README files within each directory with instructions for running the corresponding experiments.
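
To locate those per-directory instructions from a script, something like the following could be used. This is only a sketch: the directory names come from the list above, while the file name `README.md` and the helper `find_experiment_readmes` are assumptions rather than anything the repository defines.

```python
from pathlib import Path

# Experiment directory names as listed above.
EXPERIMENT_DIRS = ("scale_comparison", "quality_comparison", "pointnav_comparison")


def find_experiment_readmes(repo_root, readme_name="README.md"):
    """Map each experiment directory to its README path, or None if absent."""
    root = Path(repo_root)
    found = {}
    for name in EXPERIMENT_DIRS:
        candidate = root / name / readme_name
        found[name] = candidate if candidate.is_file() else None
    return found


if __name__ == "__main__":
    # Assumes the script is run from the root of the cloned repository.
    for name, path in find_experiment_readmes(".").items():
        print(f"{name}: {path if path else 'README not found'}")
```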

Acknowledgements

We thank all the volunteers who contributed to the dataset curation effort: Harsh Agrawal, Sashank Gondala, Rishabh Jain, Shawn Jiang, Yash Kant, Noah Maestre, Yongsen Mao, Abhinav Moudgil, Sonia Raychaudhuri, Ayush Shrivastava, Andrew Szot, Joanne Truong, Madhawa Vidanapathirana, Joel Ye. We thank our collaborators at Matterport for their contributions to the dataset: Conway Chen, Victor Schwartz, Nicole Rogers, Sachal Dhillon, Raghu Munaswamy, Mark Anderson.

License

The code in this repository is MIT licensed. See the LICENSE file for details. The trained models are considered data derived from the corresponding scene datasets.
