PyTorch code for the paper "FIERY: Future Instance Segmentation in Bird's-Eye view from Surround Monocular Cameras"

This is the PyTorch implementation for inference and training of the future prediction bird's-eye view network as described in:

FIERY: Future Instance Segmentation in Bird's-Eye view from Surround Monocular Cameras

Anthony Hu, Zak Murez, Nikhil Mohan, Sofía Dudas, Jeffrey Hawke, Vijay Badrinarayanan, Roberto Cipolla and Alex Kendall

preprint (2021)
Blog post

FIERY future prediction
Multimodal future predictions by our bird’s-eye view network.
Top two rows: RGB camera inputs. The predicted future trajectories and segmentations are projected to the ground plane in the images.
Bottom row: future instance prediction in bird’s-eye view in a 100m×100m capture size around the ego-vehicle, which is indicated by a black rectangle in the center.

If you find our work useful, please consider citing:

  title     = {{FIERY}: Future Instance Segmentation in Bird's-Eye view from Surround Monocular Cameras},
  author    = {Anthony Hu and Zak Murez and Nikhil Mohan and Sofía Dudas and 
               Jeffrey Hawke and Vijay Badrinarayanan and Roberto Cipolla and Alex Kendall},
  booktitle = {arXiv preprint},
  year = {2021}


  • Create the conda environment by running conda env create.

🏄 Prediction

🔥 Pre-trained models

All the configs are in the folder fiery/configs

| Config | Dataset | Past context | Future horizon | BEV size | IoU | VPQ |
|---|---|---|---|---|---|---|
| baseline.yml | NuScenes | 1.0s | 2.0s | 100mx100m (50cm res.) | 37.0 | 29.5 |
| lyft/baseline.yml | Lyft | 0.8s | 2.0s | 100mx100m (50cm res.) | 36.6 | 29.5 |
| literature/pon_setting.yml | NuScenes | 0.0s | 0.0s | 100mx50m (25cm res.) | 39.9 | - |
| literature/lift_splat_setting.yml | NuScenes | 0.0s | 0.0s | 100mx100m (50cm res.) | 36.7 | - |
| literature/fishing_setting.yml | NuScenes | 1.0s | 2.0s | 32.0mx19.2m (10cm res.) | 58.5 | - |
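
The BEV size column pairs a metric extent with a cell resolution, and together these fix the size of the grid the network predicts on. A minimal sketch of that arithmetic is given below. The helper name bev_grid_shape is hypothetical and is not part of the repository.

```python
def bev_grid_shape(extent_m, resolution_m):
    """Return the number of BEV cells along each axis.

    extent_m: (length, width) of the BEV area in metres.
    resolution_m: side length of one square cell in metres.
    """
    # round() guards against floating-point quotients that land
    # just below the intended integer.
    return tuple(round(side / resolution_m) for side in extent_m)


# Extents and resolutions taken from the BEV size column.
print(bev_grid_shape((100.0, 100.0), 0.50))  # (200, 200)
print(bev_grid_shape((100.0, 50.0), 0.25))   # (400, 200)
print(bev_grid_shape((32.0, 19.2), 0.10))    # (320, 192)
```

Under this reading, the baseline setting corresponds to a 200x200 grid.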

🏊 Training

To train the model from scratch on NuScenes:

  • Run python --config fiery/configs/baseline.yml DATASET.DATAROOT ${NUSCENES_DATAROOT}
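
The trailing DATASET.DATAROOT ${NUSCENES_DATAROOT} pair in the command above reads as a dotted-key override of a value from the YAML config. How the repository parses it is not shown here, so the sketch below only illustrates the general idea on a plain nested dict. The helper apply_overrides and the default values are hypothetical.

```python
import copy


def apply_overrides(config, overrides):
    """Apply ['A.B', value, 'C.D', value, ...] pairs to a nested dict."""
    if len(overrides) % 2 != 0:
        raise ValueError("overrides must be KEY VALUE pairs")
    result = copy.deepcopy(config)  # leave the caller's config untouched
    for dotted_key, value in zip(overrides[0::2], overrides[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node[name]  # raises KeyError on an unknown section
        if leaf not in node:
            raise KeyError(f"unknown config key: {dotted_key}")
        node[leaf] = value
    return result


# Illustrative defaults only; a real config has many more fields.
defaults = {"DATASET": {"NAME": "nuscenes", "DATAROOT": "./nuscenes/"}}
cfg = apply_overrides(defaults, ["DATASET.DATAROOT", "/data/nuscenes"])
print(cfg["DATASET"]["DATAROOT"])  # /data/nuscenes
```

Rejecting unknown keys means a typo in an override fails loudly instead of silently adding a new field.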

🙌 Credits

Big thanks to Piotr Sokólski (@pyetras) for the panoptic metric implementation, and to Hannes Liik (@hannesliik) for the awesome future trajectory visualisation on the ground plane.

  • loss < 0

    Hi, thanks for your great work. I have a question about the loss. When I trained the model on my own data, the loss was < 0 at epoch 0. Is this normal? Config: baseline.yaml in the project.

    opened by YiJiangYue 6
  • All losses become NaN after about 1 epoch of training

    Thank you for sharing this great work!

    When I ran the training code, I got NaN for all losses after about 1 epoch of training. This problem is reproduced whenever I run the training code. (I have tested it three times.)

    I set up the same environment with Anaconda and used the same hyper-parameters. (The only difference is that our PyTorch version is 1.7.1 and yours is 1.7.0; all other modules are the same as yours.)

    Please share your idea about this problem, if you have any. Thanks!

    opened by jwookyoo 6
  • Question about the projection_to_birds_eye_view function

    Congratulations on your great work!

    I want to follow your work for future research and I have some questions about your released code below:

    In the file of your code, can you provide more details about the get_geometry function and the projection_to_birds_eye_view function? I'm confused about how they actually work, especially the code shown in the red box below.

    Thank you very much. Looking forward to your reply!

    opened by taylover-pei 5
  • AttributeError: 'FigureCanvasTkAgg' object has no attribute 'renderer'

    Hello, I recently found your great work and wanted to try the "Visualisation" part locally to check the results. After I ran the command python --checkpoint ${CHECKPOINT_PATH}, my terminal printed an error like the one in the title.

    I tried to solve it by searching on Google, but that did not help. Could you help me if you know how to solve it? Many thanks.

    opened by Ianpengg 3
  • May I know where is the checkpoint getting saved?

    I don't see anywhere that the checkpoint is being saved, and when resuming training I get the error "size mismatch for model.temporal_model.model.1.aggregation.0.conv.weight".

    opened by pranavi77 2
  • The result of fiery static

    If I want to reproduce the result of FIERY Static for Setting 2 in Table I of your paper, should I use the config in "configs/single_timeframe.yml"? When I train the network from scratch using this config file, the IoU is 39.2 when I use the "". However, the result in the paper is 35.8. Is there another parameter that needs to be modified when I want the network to take one frame as input and output the segmentation of the present frame?

    opened by DFLyan 2
  • Question about panoptic_metrics function

    Would you be able to explain how the panoptic_metrics function works? In particular, I wonder why 'void' is included for 'combine_mask', and why 'background' should be changed from 0 to 1.

    Also, it is hard to understand the code under the comment "# hack for bincounting 2 arrays together".

    Thank you!

    opened by jwookyoo 2
  • Is future_egopose necessary for inference?

    Thanks for your great work. I have a small question about the future ego pose during inference. It seems a little tricky, because flow prediction is a module that runs before motion planning. In real cases, the flow prediction module has no chance of getting the future ego pose. But the code seems to show that the future ego pose is irreplaceable in inference: when I set it to None, inference doesn't work.

    opened by synsin0 1
  • clarification evaluation

    Hello and many thanks for your work and sharing your code.

    I have a question regarding the way you compute your IoU metric and how it compares against Lift-splat.

    You use stat_scores_multiple_classes from PLmetrics to compute the IoU. Correct me if I am wrong, but by default the threshold of this method is 0.5.

    On the other hand, in get_batch_iou of LFS they use a threshold of 0: pred = (preds > 0)

    Wouldn't this have an impact on the evaluation results, and thus on how you compare to them?

    opened by F-Barto 1
  • Question on deleting unused layers and self.downsample

    Hi, I couldn't understand how the self.downsample parameter was set (why 8 and 16 and how it affects upsampling_in_channels) and why delete_unused_layers is required in the encoder model. I tried to search the efficientnet-pytorch implementation and couldn't find any reference for this operation. Could you explain briefly why this is required? Thank you!

    opened by benhgm 1
  • question about instance_flow

    Thanks for your excellent work! I have some questions about instance_flow.

        warped_instance_seg = {}
        # t0,f01-->t1; t1,f12-->t2; t2,f23-->t3
        # t1,f10-->t0; t2,f21-->t1
        for t in range(1, seq_len):
            warped_inst_t = warp_features(instance_img[t].unsqueeze(0).unsqueeze(1).float(),  # 1, 1, 200, 200
                                          future_egomotion_inv[t - 1].unsqueeze(0), mode='nearest',
                                          spatial_extent=spatial_extent)
            warped_instance_seg[t] = warped_inst_t[0, 0]

    In your paper: "Finally, we obtain feature flow labels by comparing the position of the instance centers of gravity between two consecutive timesteps". I think the code should convert t to t-1, not t-1 to t. How does it obtain the feature flow? I'm really confused about it. I'm looking forward to your reply.

    opened by qfwysw 1
  • Bad results when evaluating pretrained checkpoints

    Hi. Thanks for your great work. I followed your instructions to extract the nuScenes dataset. I ran with the official pretrained checkpoint but got the following output:

        iou 53.5 & 28.6
        pq 39.8 & 18.0
        sq 69.4 & 66.3
        rq 57.4 & 27.1

    Is there something wrong? It seems to be much lower than the results you got.

    opened by huangzhengxiang 1
  • Dear author, the total loss value < 0, is it normal?

    Dear author, I just ran the code without any changes, and during training the total summed loss was < 0.

    It looks weird. Is that caused by the "uncertainty" setting? Is that normal? Many thanks.

    opened by emilyemliyM 0
  • Pytorch Lightning stuck the computer and finally killed

    Thanks for your great work. I'd like to reproduce the training process, but I encountered an error. When I use multi-GPU distributed training, the logging information seems normal at first, but afterwards the remote server hangs, the connection is reset, and finally the process is killed. My remote server is a standalone machine with 4x RTX 3090. Are there any issues with PyTorch Lightning distributed training that may cause this failure?

    opened by synsin0 1
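
Two of the questions above ask whether a negative total loss is normal and whether the "uncertainty" setting could be the cause. If the task losses are combined with learned uncertainty weights, where each task contributes exp(-s) * loss + s for a learnable log-variance s, then a negative total is possible as soon as s drops below zero, even though every task loss is non-negative. The sketch below shows only that arithmetic. It assumes this weighting form and is not taken from the repository's code; the function name is hypothetical.

```python
import math


def uncertainty_weighted_total(task_losses, log_variances):
    """Sum of exp(-s_i) * L_i + s_i over tasks (assumed weighting form)."""
    return sum(math.exp(-s) * loss + s
               for loss, s in zip(task_losses, log_variances))


# Every task loss is non-negative, yet the total can become negative.
losses = [0.05, 0.02]
print(uncertainty_weighted_total(losses, [0.0, 0.0]) > 0)    # True
print(uncertainty_weighted_total(losses, [-2.0, -3.0]) < 0)  # True
```

Under this assumption the additive s term is unbounded below, so a negative total does not by itself indicate a bug.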