Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving

Last update: Dec 19, 2022

Overview

Visual 3D Detection Package:

This repo aims to provide flexible and reproducible visual 3D detection on KITTI dataset. We expect scripts starting from the current directory, and treat ./visualDet3D as a package that we could modify and test directly instead of a library. Several useful scripts are provided in the main directory for easy usage.

We believe that visual tasks are interconnected, so we make this library extensible to more experiments. The package uses registry to register datasets, models, processing functions and more allowing easy inserting of new tasks/models while not interfere with the existing ones.

Related Paper:

This repo contains the official implementation of 2021 RAL paper Ground-aware Monocular 3D Object Detection for Autonomous Driving. Arxiv Page. Pretrained model can be found at release pages.

@ARTICLE{9327478,
  author={Y. {Liu} and Y. {Yuan} and M. {Liu}},
  journal={IEEE Robotics and Automation Letters}, 
  title={Ground-aware Monocular 3D Object Detection for Autonomous Driving}, 
  year={2021},
  doi={10.1109/LRA.2021.3052442}}

Key Features

SOTA Performance State of the art result on visual 3D detection.
Modular Design Modular design for dataset, network and running pipelines.
Support Various Task Compatible with the training and testing of mono/stereo 3D detection and depth prediction.
Distributed & Single GPU Support training with multiple GPUs.
Installation-Free Setup The setup process only build operations and does not require installation to keep the environment clean.
Global Path-based IMDB Do not need data placed inside the folder, convienient for managing data and code separately.

We provide start-up solutions for Mono3D, Depth Predictions and more (until further publication).

Reference: this repo borrows codes and ideas from retinanet, mmdetection, M3D-RPN, DORN, EdgeNets, det3

Setup

Environment setup.

pip3 install -r requirement.txt

or manually check dependencies.

# build ops (deform convs), We will not install operations into the system environment
./make.sh

Start Training

Please check the corresponding task: Mono3D, Depth Predictions. More demo will be available through contributions and further paper submission.

Config and Path setup.

Please modify the path and other parameters in config/*.py. config/*_example files are templates.

Notice: *_examples are NOT utilized by the code and *.py under /config is ignored by .gitignore.

The content of the selected config file will be recorded in tensorboard at the beginning of training.

important paths to modify in config :

cfg.path.data_path: Path to KITTI training data. We expect calib, image_2, image_3, label_2 being the subfolder (directly unzipping the downloaded zips will be fine)
cfg.path.test_path: Path to KITTI testing data. We expect calib, image_2 being the subfolder.
cfg.path.visualDet3D_path: Path to the "visualDet3D" directorty of the current repo
cfg.path.project_path: Path to the workdirs of the projects (will have temp_outputs, log, checkpoints)

Please check the template's comments and other comments in codes to fully exploit the repo.

Further Info and Bug Issues

Open issues on the repo if you meet troubles or find a bug or have some suggestions.
Email to [email protected]

Other Resources

Related Codes

Comments

Could you provide your results in each config?

Could you provide your results in each config? Thanks for your work. Because my results using your code is lower than that in papers. I would appreciate if you provide the results for each config. And if it's possible, could you provide a docker for easily install(some machine can't install well, due to conflict among packages?)

And my results as follows: (It seems something wrong in Yolo3D_example and KM3D_example) In Ground-aware:)

Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:91.76, 79.74, 62.81
bev  AP:22.39, 17.37, 13.54   
3d   AP:16.80, 12.73, 10.22    <----    In paper: 22.16 | 15.71 | 11.75
aos  AP:90.96, 78.19, 61.48
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:91.76, 79.74, 62.81
bev  AP:56.30, 41.20, 32.98
3d   AP:51.48, 37.65, 29.91
aos  AP:90.96, 78.19, 61.48

In monoflex:

Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:97.04, 91.58, 81.64
bev  AP:30.90, 22.72, 19.22
3d   AP:22.91, 16.49, 13.59    <----    In paper:   23.64 | 17.51 | 14.83
aos  AP:96.92, 91.25, 81.16
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:97.04, 91.58, 81.64
bev  AP:66.93, 49.88, 43.21
3d   AP:62.03, 46.13, 39.80
aos  AP:96.92, 91.25, 81.16

In RTM3d:

Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:96.98, 88.85, 78.72
bev  AP:16.05, 12.08, 9.98    <----    In paper:  27.83 | 23.38 | 21.69
3d   AP:10.20, 7.86, 6.26      <----    In paper:   22.50 | 19.60 | 17.12
aos  AP:96.45, 87.73, 77.50
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:96.98, 88.85, 78.72
bev  AP:50.02, 37.08, 30.29
3d   AP:43.69, 32.07, 26.78
aos  AP:96.45, 87.73, 77.50

opened by mrsempress 10

Are the detection results is on test split or validation split?

I find the detailed detection results on web page : https://github.com/Owen-Liuyuxuan/visualDet3D/releases/tag/1.1. I wander to which data split is the detection results based on. Thank you.

opened by jichaofeng 6
Did you try to directly regress the x, y, z of the 3D bounding center?

Thanks for sharing the excellent work and the relevant code!

My question is:

After my own experiments and some slight changes, I found the performance (3D IOU) of the model was pretty bad if I directly regress the coordinate of the 3D bbox centre. However, the 2D Iou was better and reached 90% in the first few epochs. I am really confused about this result. Have you tried to directly regress these parameters? Do you have any idea about it?

These results are generated by resnet18 on Chen Split.

Evaluation result of regressing alpha Car AP(Average Precision)@0.70, 0.70, 0.70: bbox AP:88.38, 71.46, 64.18 bev AP:8.49, 5.02, 4.39 3d AP:2.49, 1.81, 1.51 aos AP:88.33, 71.40, 64.11 Car AP(Average Precision)@0.70, 0.50, 0.50: bbox AP:88.38, 71.46, 64.18 bev AP:51.12, 37.06, 34.06 3d AP:31.00, 20.17, 19.07 aos AP:88.33, 71.40, 64.11

Evaluation result of regressing x, y, z and theta Car AP(Average Precision)@0.70, 0.70, 0.70: bbox AP:96.50, 80.83, 63.48 bev AP:0.75, 0.43, 0.38 3d AP:0.22, 0.21, 0.08 Car AP(Average Precision)@0.70, 0.50, 0.50: bbox AP:96.50, 80.83, 63.48 bev AP:6.38, 4.43, 3.37 3d AP:4.52, 3.50, 2.40

Also, I noticed that the result on the validation set is pretty good using the full model. But it seems that there is a huge performance gap between the validation set and the test set on the KITTI server. So is it right that currently, the main problem is overfitting?

opened by yilinliu77 6
ImportError: cannot import name 'deform_conv_ext' from 'visualDet3D.networks.lib.ops.dcn' (/home/cv1/visualDet3D/visualDet3D/networks/lib/ops/dcn/__init__.py)

Hi, I just get some error when I ran mono3d.

In this step,

Compute image database and anchors mean/std

You can run ./launcher/det_precompute.sh without arguments to see helper documents

./launcher/det_precompute.sh config/$CONFIG_FILE.py train ./launcher/det_precompute.sh config/$CONFIG_FILE.py test # only for upload testing

I got a message that :

Traceback (most recent call last): File "scripts/imdb_precompute_3d.py", line 11, in from visualDet3D.networks.heads.anchors import Anchors File "/home/cv1/visualDet3D/visualDet3D/networks/init.py", line 1, in from .pipelines import * File "/home/cv1/visualDet3D/visualDet3D/networks/pipelines/init.py", line 3, in from .evaluators import evaluate_kitti_obj File "/home/cv1/visualDet3D/visualDet3D/networks/pipelines/evaluators.py", line 16, in from visualDet3D.data.kitti.utils import write_result_to_file File "/home/cv1/visualDet3D/visualDet3D/data/init.py", line 1, in from .kitti.dataset import mono_dataset, depth_mono_dataset File "/home/cv1/visualDet3D/visualDet3D/data/kitti/init.py", line 1, in from .dataset import KittiMonoDataset, KittiMonoTestDataset File "/home/cv1/visualDet3D/visualDet3D/data/kitti/dataset/init.py", line 4, in from .KM3D_dataset import KittiRTM3DDataset File "/home/cv1/visualDet3D/visualDet3D/data/kitti/dataset/KM3D_dataset.py", line 19, in from visualDet3D.networks.utils.rtm3d_utils import gen_hm_radius, project_to_image, gaussian_radius File "/home/cv1/visualDet3D/visualDet3D/networks/utils/rtm3d_utils.py", line 6, in from visualDet3D.networks.lib.ops.iou3d.iou3d import boxes_iou3d_gpu File "/home/cv1/visualDet3D/visualDet3D/networks/lib/ops/init.py", line 1, in from .dcn.deform_conv import ModulatedDeformConvPack, DeformConvPack File "/home/cv1/visualDet3D/visualDet3D/networks/lib/ops/dcn/deform_conv.py", line 50, in from . import deform_conv_ext

ImportError: cannot import name 'deform_conv_ext' from 'visualDet3D.networks.lib.ops.dcn' (/home/cv1/visualDet3D/visualDet3D/networks/lib/ops/dcn/init.py)

So, How can I fix that error?

opened by sjg02122 6
【Problem】iou3d_cuda: undefined symbol

Hi,thanks for you nice work! I've run sh ./make.shand the program have created a iou3d_cuda.cpython-37m-x86_64-linux-gnu.so document. But when I run sh ./launchers/det_precompute.sh config/Yolo3D_example.py train ,it turns out a traceback saying */iou3d/iou3d_cuda.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _Z17nmsNormalLauncherPKfPyif. Could you please tell me how to solve this problem?Thank you!

opened by BigHandsome-Lin 6
when i run "./launchers/eval.sh config/yolo_stereo.py 0 checkpoint/Stereo3D_latest.pth validation" or on test data

当在测试集和验证集跑数据的时候，都会出现如下问题，不理解为什么NMS的输入的分数和检测框的长度会有不同

CUDA available: True pickle/Stereo3D/output/test/imdb.pkl Found evaluate function clean up the recorder directory of pickle/Stereo3D/output/test/data rebuild pickle/Stereo3D/output/test/data 0%| | 0/7518 [00:00<?, ?it/s]/home/lishengwen/code/visualDet3D/visualDet3D/networks/lib/PSM_cost_volume.py:82: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead. cost = Variable( PSM Cos Volume takes 0.002985239028930664 seconds at call time 1 /home/lishengwen/code/visualDet3D/visualDet3D/networks/lib/PSM_cost_volume.py:49: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead. cost = Variable( 0%| | 1/7518 [00:00<43:24, 2.89it/s]PSM Cos Volume takes 0.0035042762756347656 seconds at call time 2 PSM Cos Volume takes 0.0029206275939941406 seconds at call time 3 Cost Volume takes 0.0025892257690429688 seconds at call time 1 0%| | 2/7518 [00:00<29:20, 4.27it/s]PSM Cos Volume takes 0.0034818649291992188 seconds at call time 4 PSM Cos Volume takes 0.002855062484741211 seconds at call time 5 Cost Volume takes 0.0025739669799804688 seconds at call time 2 0%| | 3/7518 [00:00<23:13, 5.39it/s]PSM Cos Volume takes 0.0034427642822265625 seconds at call time 6 PSM Cos Volume takes 0.0028700828552246094 seconds at call time 7 Cost Volume takes 0.002550363540649414 seconds at call time 3 0%| | 4/7518 [00:00<21:47, 5.75it/s]PSM Cos Volume takes 0.0047397613525390625 seconds at call time 8 PSM Cos Volume takes 0.005141258239746094 seconds at call time 9 Cost Volume takes 0.00797271728515625 seconds at call time 4 0%| | 5/7518 [00:00<22:43, 5.51it/s]PSM Cos Volume takes 0.003518819808959961 seconds at call time 10 PSM Cos Volume takes 0.002848386764526367 seconds at call time 11 Cost Volume takes 0.0024688243865966797 seconds at call time 5 0%| | 6/7518 [00:01<22:04, 5.67it/s]PSM Cos Volume takes 0.004822254180908203 seconds at call time 12 PSM Cos Volume takes 0.003919839859008789 seconds at call time 13 Cost Volume takes 0.007089376449584961 seconds at call time 6 0%| | 7/7518 [00:01<22:03, 5.67it/s]PSM Cos Volume takes 0.0034933090209960938 seconds at call time 14 PSM Cos Volume takes 0.0028336048126220703 seconds at call time 15 Cost Volume takes 0.0024688243865966797 seconds at call time 7 0%|▏ | 8/7518 [00:01<20:27, 6.12it/s]PSM Cos Volume takes 0.0042951107025146484 seconds at call time 16 PSM Cos Volume takes 0.005505800247192383 seconds at call time 17 Cost Volume takes 0.010698080062866211 seconds at call time 8 0%|▏ | 9/7518 [00:01<21:04, 5.94it/s]PSM Cos Volume takes 0.0034551620483398438 seconds at call time 18 PSM Cos Volume takes 0.002899646759033203 seconds at call time 19 Cost Volume takes 0.0024847984313964844 seconds at call time 9 2%|██▉ | 178/7518 [00:26<17:56, 6.82it/s] Traceback (most recent call last): File "scripts/eval.py", line 55, in fire.Fire(main) File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/fire/core.py", line 141, in Fire component_trace = _Fire(component, args, parsed_flag_args, context, name) File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire component, remaining_args = _CallAndUpdateTrace( File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace component = fn(*varargs, **kwargs) File "scripts/eval.py", line 52, in main evaluate_detection(cfg, detector, dataset, None, 0, result_path_split=split_to_test) File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context return func(*args, **kwargs) File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/pipelines/evaluators.py", line 85, in evaluate_kitti_obj test_one(cfg, index, dataset_val, model, test_func, backprojector, projector, result_path) File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/pipelines/evaluators.py", line 111, in test_one scores, bbox, obj_names = test_func(collated_data, model, None, cfg=cfg) File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context return func(*args, **kwargs) File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/pipelines/testers.py", line 39, in test_stereo_detection scores, bbox, obj_index = module([left_images.cuda().float().contiguous(), right_images.cuda().float().contiguous(), torch.tensor(P2).cuda().float(), torch.tensor(P3).cuda().float()]) File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/detectors/yolostereo3d_detector.py", line 103, in forward return self.test_forward(*inputs) File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/detectors/yolostereo3d_detector.py", line 93, in test_forward scores, bboxes, cls_indexes = self.bbox_head.get_bboxes(cls_preds, reg_preds, anchors, P2, left_images) File "/home/lishengwen/code/visualDet3D/visualDet3D/networks/heads/detection_3d_head.py", line 385, in get_bboxes keep_inds = nms(bboxes[:, :4], max_score, nms_iou_thr) File "/home/lishengwen/anaconda3/envs/visualDet3D/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 42, in nms return torch.ops.torchvision.nms(boxes, scores, iou_threshold) RuntimeError: boxes and scores should have same number of elements in dimension 0, got 495 and 494

opened by monsters-s 5
I don't know why I could not reproduce the validation results.

For the Car class, The validation results are: bev AP: 73.51 48.49 37.47 3D AP: 62.28 39.16 30.51 Do you computing the disparity map using the point cloud or openCV BlockMatching?

opened by jichaofeng 5
Template problem

Hello, first of all, thank you for doing such an excellent job. I would like to ask you a question, is the Yolo3D_example file in the config folder the corresponding work in the paper? If I want to reproduce the paper code I should use this template. Thank you again .

opened by lgq-gpu 4

测试多个类别的时候报错

@Owen-Liuyuxuan 你好，我在测试只有Car一个类别的时候没有问题，然后我想测试3个类别，就按照你上面的提示，在Yolo3D_example.py文件中修改了对应的三行代码，然后报错如下：

(yolov4) [email protected]:~/Disk2/3_proj/visualDet3D$ ./launchers/eval.sh config/Yolo3D_example.py 0 workdirs/Mono3D/checkpoint/GroundAware_pretrained.pth test
CUDA available: True
Traceback (most recent call last):
  File "scripts/eval.py", line 55, in <module>
    fire.Fire(main)
  File "/home/shl/anaconda3/envs/yolov4/lib/python3.6/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/shl/anaconda3/envs/yolov4/lib/python3.6/site-packages/fire/core.py", line 471, in _Fire
    target=component.__name__)
  File "/home/shl/anaconda3/envs/yolov4/lib/python3.6/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "scripts/eval.py", line 37, in main
    detector = DETECTOR_DICT[cfg.detector.name](cfg.detector)
  File "/media/shl/2b7567a9-ed41-40af-8673-4d47b46a0f32/home/zhihui/Disk2/3_proj/visualDet3D/visualDet3D/networks/detectors/yolomono3d_detector.py", line 65, in __init__
    self.build_head(network_cfg)
  File "/media/shl/2b7567a9-ed41-40af-8673-4d47b46a0f32/home/zhihui/Disk2/3_proj/visualDet3D/visualDet3D/networks/detectors/yolomono3d_detector.py", line 137, in build_head
    **(network_cfg.head)
  File "/media/shl/2b7567a9-ed41-40af-8673-4d47b46a0f32/home/zhihui/Disk2/3_proj/visualDet3D/visualDet3D/networks/heads/detection_3d_head.py", line 32, in __init__
    self.anchors = Anchors(preprocessed_path=preprocessed_path, readConfigFile=read_precompute_anchor, **anchors_cfg)
  File "/media/shl/2b7567a9-ed41-40af-8673-4d47b46a0f32/home/zhihui/Disk2/3_proj/visualDet3D/visualDet3D/networks/heads/anchors.py", line 36, in __init__
    self.anchors_mean_original[i]  = np.load(npy_file) #[30, 2, 6] #[z,  sinalpha, cosalpha, w, h, l,]
ValueError: could not broadcast input array from shape (16,2,6) into shape (16,3,6)
(yolov4) [email protected]:~/Disk2/3_proj/visualDet3D$

请问这个错误是需要修改anchor_mean_Car.npy和anchor_std_Car.npy 这两个文件吗，我目前不知道怎么解决这个问题，还希望你可以抽空帮忙一下，谢谢啦

Originally posted by @shliang0603 in https://github.com/Owen-Liuyuxuan/visualDet3D/issues/17#issuecomment-897407923

opened by shliang0603 4

error occured in install step . /make.sh

envs: os: ubuntu18.04 cuda: nvcc 10.1 python: 3.7.10 torch: 1.8.1 tensorflow: 2.5.0

When run ./make.sh error occured as follow:

(ground_aware) [email protected]:/zyf/code/visualDet3D# ./make.sh /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn /zyf/code/visualDet3D running build_ext building '..deform_conv_ext' extension creating /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build creating /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7 creating /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src creating /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda Emitting ninja build file /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/build.ninja... Compiling objects... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/3] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda/deform_conv_cuda_kernel.o.d -DWITH_CUDA -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/TH -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/root/anaconda3/envs/ground_aware/include/python3.7m -c -c /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/src/cuda/deform_conv_cuda_kernel.cu -o /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda/deform_conv_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="gcc"' '-DPYBIND11_STDLIB="libstdcpp"' '-DPYBIND11_BUILD_ABI="cxxabi1011"' -DTORCH_EXTENSION_NAME=deform_conv_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 -std=c++14 FAILED: /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda/deform_conv_cuda_kernel.o /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda/deform_conv_cuda_kernel.o.d -DWITH_CUDA -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/TH -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/root/anaconda3/envs/ground_aware/include/python3.7m -c -c /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/src/cuda/deform_conv_cuda_kernel.cu -o /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda/deform_conv_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=deform_conv_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 -std=c++14 nvcc fatal : Unknown option '-generate-dependencies-with-compile' [2/3] c++ -MMD -MF /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda/deform_conv_cuda.o.d -pthread -B /root/anaconda3/envs/ground_aware/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/cuda/include -fPIC -DWITH_CUDA -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/TH -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/root/anaconda3/envs/ground_aware/include/python3.7m -c -c /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/src/cuda/deform_conv_cuda.cpp -o /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/cuda/deform_conv_cuda.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=deform_conv_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ In file included from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/ATen/Parallel.h:140:0, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:13, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/extension.h:4, from /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/src/cuda/deform_conv_cuda.cpp:4: /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/ATen/ParallelOpenMP.h:83:0: warning: ignoring #pragma omp parallel [-Wunknown-pragmas] #pragma omp parallel for if ((end - begin) >= grain_size)

[3/3] c++ -MMD -MF /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/deform_conv_ext.o.d -pthread -B /root/anaconda3/envs/ground_aware/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/cuda/include -fPIC -DWITH_CUDA -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/TH -I/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/root/anaconda3/envs/ground_aware/include/python3.7m -c -c /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/src/deform_conv_ext.cpp -o /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/build/temp.linux-x86_64-3.7/src/deform_conv_ext.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=deform_conv_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ In file included from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/ATen/Parallel.h:140:0, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:13, from /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/torch/extension.h:4, from /zyf/code/visualDet3D/visualDet3D/networks/lib/ops/dcn/src/deform_conv_ext.cpp:4: /root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/include/ATen/ParallelOpenMP.h:83:0: warning: ignoring #pragma omp parallel [-Wunknown-pragmas] #pragma omp parallel for if ((end - begin) >= grain_size)

ninja: build stopped: subcommand failed. Traceback (most recent call last): File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build env=env) File "/root/anaconda3/envs/ground_aware/lib/python3.7/subprocess.py", line 512, in run output=stdout, stderr=stderr) subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "setup.py", line 198, in zip_safe=False) File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/setuptools/init.py", line 153, in setup return distutils.core.setup(**attrs) File "/root/anaconda3/envs/ground_aware/lib/python3.7/distutils/core.py", line 148, in setup dist.run_commands() File "/root/anaconda3/envs/ground_aware/lib/python3.7/distutils/dist.py", line 966, in run_commands self.run_command(cmd) File "/root/anaconda3/envs/ground_aware/lib/python3.7/distutils/dist.py", line 985, in run_command cmd_obj.run() File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run _build_ext.run(self) File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run _build_ext.build_ext.run(self) File "/root/anaconda3/envs/ground_aware/lib/python3.7/distutils/command/build_ext.py", line 340, in run self.build_extensions() File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 708, in build_extensions build_ext.build_extensions(self) File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions _build_ext.build_ext.build_extensions(self) File "/root/anaconda3/envs/ground_aware/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions self._build_extensions_serial() File "/root/anaconda3/envs/ground_aware/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial self.build_extension(ext) File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in build_extension _build_ext.build_extension(self, ext) File "/root/anaconda3/envs/ground_aware/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension depends=ext.depends) File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 538, in unix_wrap_ninja_compile with_cuda=with_cuda) File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1359, in _write_ninja_file_and_compile_objects error_prefix='Error compiling objects for extension') File "/root/anaconda3/envs/ground_aware/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build raise RuntimeError(message) from e RuntimeError: Error compiling objects for extension

opened by AromaticJ 4
Change the dataset

Hi Thanks to your code, I can learn many things. Thank you.

What I'm curious about is changing data. I want to do the training only left image and the test data right image in a monocular 3D

Which part should I modify to execute the code? I'll be waiting for the reply. Thank you.

opened by sjg02122 4
Preprocessing another data

Hi @Owen-Liuyuxuan, Thanks for your great work, I have a problem when I inference my custom data. My model use your code to run KITTI data quite well but my custom data have sharp image quite different from the KITTI dataset (1280, 1980). I can see in your code have processing data code in the ./VisualDet3D/kitti/data directory but I don't know how to use it to process my custom data. Can you give me some advice to use it? Thanks

opened by tuclen-3 7
Dockerfile

Hello,

Thanks for making this work available publicly. I have been having issues trying to get it up and running, most likely due to version mismatch. I have tried with CUDA 11.6 and CUDA 11.7, PyTorch 1.12, numba 0.56.

Would you be able to either provide a Dockerfile, or exact set of packages with their versions that has worked well?

Thanks, Shubham

opened by towardsautonomy 1
Is YOLOStereo3D aware of the stereo baseline?

Hi, I wonder whether the stereo baseline is used in YOLOStereo3D. If I want to use YOLOStereo3D trained on KITTI to perform object detection in another scenario where the stereo baseline is different from KITTI, will it work? Which part should I modify?

Thanks a lot!

opened by ootts 1

Releases(1.1.1)

1.1.1(Dec 11, 2021)

We provide an Unofficial re-implementation of Digging Into Output Representation For Monocular 3D Object Detection (Digging_M3D) to introduce a simple but important numerical trick to significantly improve the KITTI mAP scores and make a significant change to the KITTI leaderboard. Details can be found in the paper. At the time of the open-source, the paper has not been officially published, and we will keep up with the update of the paper.
Source code(tar.gz)
Source code(zip)
1.1(Mar 18, 2021)

Pretrained model for Ground-aware Monocular 3D Object Detection for Autonomous Driving.

the model file could be placed under workdirs/Stereo3D/checkpoint/ (you should provide the path to the model file in the command line)

anchor_mean/std_Car/Pedestrian.npy should be placed under workdirs/Yolo3D/output/training. You can reproduce the npy file with the scripts runned on the 'test split'.

Backward Incompatibility:

We update the function for converting between observation angle (alpha) and 3D rotation angle (theta) following the more accurate version from RTM3D. It will break the result of the models in the previous release.

And we have to retrain a new YOLOStereo3D model to adapt to this change. So the released model performs slightly differently from the KITTI one.

Notice: To get similar performance on the test-split, you need to train for more epochs (80 epochs test for example), while you only need about 50 epochs to get a saturated performance on validation split (empirically with the current learning rate settings).

| Benchmark | Easy | Moderate | Hard | |---------------------|:--------:|:-------:|:-------:| | Car Detection | 94.75 % | 84.50 %| 62.13 % | | Car Orientation | 93.65 % |82.88 % | 60.92 % | | Car 3D Detection | 65.77 % | 40.71 % | 29.99 % | | Car Bird's Eye View | 74.00 % | 49.54 % | 36.30 % | | Pedestrian Detection | 58.34 % | 49.54 %| 36.30 % | | Pedestrian Orientation | 50.41 % | 36.81 % | 31.51 % | | Pedestrian 3D Detection | 31.03 % | 20.67 % | 18.34 % | | Pedestrian Bird's Eye View | 32.52 % | 22.74 % | 19.16 % |
Source code(tar.gz)
Source code(zip)
anchor_mean_Car.npy(2.37 KB)
anchor_mean_Pedestrian.npy(2.37 KB)
anchor_std_Car.npy(2.37 KB)
anchor_std_Pedestrian.npy(2.37 KB)
Stereo3D_latest.pth(410.59 MB)
1.0(Feb 1, 2021)

Pretrained model for Ground-aware Monocular 3D Object Detection for Autonomous Driving.

the model file could be placed under workdirs/Yolo3D/checkpoint/ (you should provide the path to the model file in the command line)

anchor_mean/std_Car.npy should be placed under workdirs/Yolo3D/output/training. You can reproduce the npy file with the scripts runned on the 'test split'.

| Benchmark | Easy | Moderate | Hard | |---------------------|:--------:|:-------:|:-------:| | Car Detection | 92.35 % | 79.57 %| 59.61 % | | Car Orientation | 90.87 % |77.47 % | 57.99 % | | Car 3D Detection | 21.60 % | 13.17 % | 9.94 % | | Car Bird's Eye View | 29.38 % | 18.00 % | 13.14 % |
Source code(tar.gz)
Source code(zip)
anchor_mean_Car.npy(1.62 KB)
anchor_std_Car.npy(1.62 KB)
data.zip(2.35 MB)
GroundAware_pretrained.pth(223.16 MB)

Owner

Yuxuan Liu

Ph.D. Student in ram-lab under supervision from Prof Ming Liu

GitHub Repository https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/

Puzzle-CAM: Improved localization via matching partial and full features.

Puzzle-CAM The official implementation of "Puzzle-CAM: Improved localization via matching partial and full features".

150 Nov 14, 2022

Pytorch Lightning code guideline for conferences

Deep learning project seed Use this seed to start new deep learning / ML projects. Built in setup.py Built in requirements Examples with MNIST Badges

1k Jan 02, 2023

Weighted QMIX: Expanding Monotonic Value Function Factorisation

This repo contains the cleaned-up code that was used in "Weighted QMIX: Expanding Monotonic Value Function Factorisation"

82 Dec 29, 2022

[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation Getting Started Our codes are implemented and tested with pyth

176 Dec 15, 2022

Adaptive FNO transformer - official Pytorch implementation

Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers This repository contains PyTorch implementation of the Adaptive Fourier Neu

77 Dec 29, 2022

The official PyTorch implementation of the paper: Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing." .

F-Clip — Fully Convolutional Line Parsing This repository contains the official PyTorch implementation of the paper: *Xili Dai, Xiaojun Yuan, Haigang

115 Dec 28, 2022

This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans

This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans. TABS relies on a Res-Unet backbone, with a Vision

6 Nov 07, 2022

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving

Related tags

Overview

Visual 3D Detection Package:

Related Paper:

Key Features

Setup

Environment setup.

Start Training

Config and Path setup.

Further Info and Bug Issues

Other Resources

Related Codes

Comments

Compute image database and anchors mean/std

You can run ./launcher/det_precompute.sh without arguments to see helper documents

Releases(1.1.1)

1.1.1(Dec 11, 2021)

1.1(Mar 18, 2021)

1.0(Feb 1, 2021)

Owner

Yuxuan Liu

Puzzle-CAM: Improved localization via matching partial and full features.

Pytorch Lightning code guideline for conferences

Weighted QMIX: Expanding Monotonic Value Function Factorisation

[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

Adaptive FNO transformer - official Pytorch implementation

The official PyTorch implementation of the paper: *Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing." *.

This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans

[ICCV 2021] Learning A Single Network for Scale-Arbitrary Super-Resolution

Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

Object detection, 3D detection, and pose estimation using center point detection:

Prometheus exporter for Cisco Unified Computing System (UCS) Manager

Diverse Image Generation via Self-Conditioned GANs

Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.

A TensorFlow implementation of SOFA, the Simulator for OFfline LeArning and evaluation.

(NeurIPS 2021) Pytorch implementation of paper "Re-ranking for image retrieval and transductive few-shot classification"

Camera-caps - Examine the camera capabilities for V4l2 cameras

Exploring the Dual-task Correlation for Pose Guided Person Image Generation

[ICME 2021 Oral] CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

Neural networks applied in recognizing guitar chords using python, AutoML.NET with C# and .NET Core

[ECCV 2020] Reimplementation of 3DDFAv2, including face mesh, head pose, landmarks, and more.

The official PyTorch implementation of the paper: Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing." .