This is a collection of our NAS and Vision Transformer work.

Overview

AutoML - Neural Architecture Search

This is a collection of our AutoML-NAS work:

iRPE (NEW): Rethinking and Improving Relative Position Encoding for Vision Transformer

AutoFormer (NEW): AutoFormer: Searching Transformers for Visual Recognition

Cream (@NeurIPS'20): Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search

We also implemented our NAS algorithms on Microsoft NNI (Neural Network Intelligence).

News

  • โ˜€๏ธ Hiring research interns for neural architecture search, tiny transformer design, model compression projects: [email protected]
  • ๐Ÿ’ฅ Oct, 2021: AutoFormerV2 has been accepted by NeurIPS'21, will be released soon.
  • ๐Ÿ’ฅ Aug, 2021: Code for AutoFormer is now released.
  • ๐Ÿ’ฅ July, 2021: iRPE code (with CUDA Acceleration) is now released. Paper is here.
  • ๐Ÿ’ฅ July, 2021: iRPE has been accepted by ICCV'21.
  • ๐Ÿ’ฅ July, 2021: AutoFormer has been accepted by ICCV'21.
  • ๐Ÿ’ฅ July, 2021: AutoFormer is now available on arXiv.
  • ๐Ÿ’ฅ Oct, 2020: Code for Cream is now released.
  • ๐Ÿ’ฅ Oct, 2020: Cream was accepted to NeurIPS'20

Works

AutoFormer

AutoFormer is a new one-shot architecture search framework dedicated to vision transformer search. It entangles the weights of different vision transformer blocks in the same layers during supernet training. Thanks to this strategy, the trained supernet allows thousands of subnets to be very well trained. Specifically, the performance of these subnets with weights inherited from the supernet is comparable to that of the same subnets retrained from scratch.

AutoFormer overview
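
The weight entanglement idea can be illustrated with a small sketch. It assumes a single linear layer in which every candidate width reuses the top-left corner of one shared weight matrix. The class `SharedLinear` and its methods are hypothetical names used for illustration only, and plain Python lists stand in for the PyTorch modules used in the repository.

```python
import random


class SharedLinear:
    """One weight matrix shared by every candidate width (hypothetical helper)."""

    def __init__(self, max_in, max_out, seed=0):
        rng = random.Random(seed)
        # Weights sized for the largest candidate; smaller ones reuse a sub-block.
        self.weight = [[rng.uniform(-1, 1) for _ in range(max_in)]
                       for _ in range(max_out)]

    def sample(self, in_dim, out_dim):
        # A subnet's weights are the top-left (out_dim x in_dim) slice.
        return [row[:in_dim] for row in self.weight[:out_dim]]

    def forward(self, x, out_dim):
        # Apply the sliced weights to an input vector of length in_dim.
        w = self.sample(len(x), out_dim)
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


layer = SharedLinear(max_in=8, max_out=8)
small = layer.sample(4, 4)
large = layer.sample(8, 8)
# The small candidate's weights are contained in the large candidate's weights.
shared = all(small[r] == large[r][:4] for r in range(4))
print(shared)
```

Because the slices overlap, a training step on any one candidate also updates weights used by the others, which is the intuition behind subnets inheriting usable weights from the supernet.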

iRPE

Image RPE (iRPE for short) methods are new relative position encoding methods dedicated to 2D images. They consider directional relative distance modeling as well as the interactions between queries and relative position embeddings in the self-attention mechanism. The proposed iRPE methods are simple and lightweight, and can be easily plugged into transformer blocks. Experiments demonstrate that, solely due to the proposed encoding methods, DeiT and DETR obtain stable improvements of up to 1.5% (top-1 Acc) and 1.3% (mAP) over their original versions on ImageNet and COCO respectively, without tuning any extra hyperparameters such as learning rate and weight decay. Our ablation and analysis also yield interesting findings, some of which run counter to previous understanding.

iRPE overview
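
The two ingredients named above, directional relative positions on a 2D grid and a bias that depends on the query, can be sketched as follows. This is a simplified illustration, not the repository's CUDA-accelerated implementation: offsets are simply clipped to a fixed range, and the function names `product_bucket` and `contextual_bias` as well as the toy embedding table are assumptions made for the example.

```python
def clip(v, limit):
    """Clamp a relative offset to [-limit, limit]."""
    return max(-limit, min(limit, v))


def product_bucket(qi, ki, width, limit):
    """Map a (query, key) pair of flattened 2D positions to a bucket id.

    Signed row and column offsets are kept separately, so "left of" and
    "right of" fall into different buckets (directional encoding).
    """
    qy, qx = divmod(qi, width)
    ky, kx = divmod(ki, width)
    dy = clip(ky - qy, limit) + limit  # shifted to 0..2*limit
    dx = clip(kx - qx, limit) + limit
    return dy * (2 * limit + 1) + dx


def contextual_bias(q, rel_table, qi, ki, width, limit):
    """Bias added to an attention logit: dot(query, embedding of the bucket)."""
    r = rel_table[product_bucket(qi, ki, width, limit)]
    return sum(a * b for a, b in zip(q, r))


width, limit = 3, 1
num_buckets = (2 * limit + 1) ** 2
# Toy embedding table: one 2-dimensional vector per bucket.
rel_table = [[float(b), 1.0] for b in range(num_buckets)]

left = product_bucket(4, 3, width, limit)    # key is left of the center query
right = product_bucket(4, 5, width, limit)   # key is right of the center query
bias = contextual_bias([1.0, 2.0], rel_table, 4, 5, width, limit)
print(left, right, bias)
```

In this toy 3x3 grid the left and right neighbours of the center token receive different buckets, and the bias changes with the query vector rather than being a fixed per-offset scalar.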

Cream

[Paper] [Models-Google Drive][Models-Baidu Disk (password: wqw6)] [Slides] [BibTex]

In this work, we present a simple yet effective architecture distillation method. The central idea is that subnetworks can learn collaboratively and teach each other throughout the training process, aiming to boost the convergence of individual models. We introduce the concept of prioritized path, which refers to the architecture candidates exhibiting superior performance during training. Distilling knowledge from the prioritized paths is able to boost the training of subnetworks. Since the prioritized paths are changed on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop.
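
A minimal sketch of the prioritized path idea follows, under simplifying assumptions: the board holds a fixed number of candidates, a new path replaces the worst entry when it is more accurate (or equally accurate with fewer FLOPs), and the teacher is simply the most accurate path on the board. The class `PathBoard` and its methods are hypothetical, and the teacher selection rule here is a stand-in chosen for illustration rather than the method used in the paper.

```python
class PathBoard:
    """Keeps the K best candidate paths seen so far (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        self.paths = []  # entries are (accuracy, flops, architecture)

    def update(self, arch, accuracy, flops):
        """Insert a path if the board has room or it beats the worst entry."""
        entry = (accuracy, flops, arch)
        if len(self.paths) < self.size:
            self.paths.append(entry)
            return True
        # Worst entry: lowest accuracy, ties broken by higher FLOPs.
        worst = min(self.paths, key=lambda p: (p[0], -p[1]))
        if accuracy > worst[0] or (accuracy == worst[0] and flops < worst[1]):
            self.paths.remove(worst)
            self.paths.append(entry)
            return True
        return False

    def teacher(self):
        """Pick a teacher; here simply the most accurate path on the board."""
        return max(self.paths, key=lambda p: (p[0], -p[1]))[2]


board = PathBoard(size=2)
board.update((0, 1, 2), accuracy=0.60, flops=300)
board.update((1, 1, 0), accuracy=0.65, flops=320)
replaced = board.update((2, 0, 1), accuracy=0.70, flops=310)  # evicts the 0.60 path
rejected = board.update((0, 0, 0), accuracy=0.50, flops=200)  # worse than the board
print(replaced, rejected, board.teacher())
```

Because the board is refreshed while training proceeds, the set of teachers changes on the fly, and the paths left at the end are the best ones seen.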

Bibtex

@InProceedings{iRPE,
    author    = {Wu, Kan and Peng, Houwen and Chen, Minghao and Fu, Jianlong and Chao, Hongyang},
    title     = {Rethinking and Improving Relative Position Encoding for Vision Transformer},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {10033-10041}
}

@InProceedings{AutoFormer,
  title={AutoFormer: Searching Transformers for Visual Recognition},
  author={Chen, Minghao and Peng, Houwen and Fu, Jianlong and Ling, Haibin},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

@article{Cream,
  title={Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search},
  author={Peng, Houwen and Du, Hao and Yu, Hongyuan and Li, Qi and Liao, Jing and Fu, Jianlong},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}

License

Licensed under an MIT license.

Comments
  • Please open source the teacher logits

    Dear Authors,

    Very impressive work. For reproducibility purposes could you please share the teacher logits files for all the teachers shown in this paper?

    TinyViT 
    opened by sanyalsunny111 15
  • RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation:

    I encountered a runtime error when I tried to search for an architecture based on your code.

    /opt/conda/conda-bld/pytorch_1565272279342/work/torch/csrc/autograd/python_anomaly_mode.cpp:57: UserWarning: Traceback of forward call that caused the error:
      File "tools/train.py", line 300, in <module>
        main()
      File "tools/train.py", line 259, in main
        est=model_est, local_rank=args.local_rank)
      File "/opt/tiger/cream/lib/core/train.py", line 55, in train_epoch
        output = model(input, random_cand)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/opt/tiger/cream/lib/models/structures/supernet.py", line 121, in forward
        x = self.forward_features(x, architecture)
      File "/opt/tiger/cream/lib/models/structures/supernet.py", line 113, in forward_features
        x = blocks[arch](x)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/timm/models/efficientnet_blocks.py", line 133, in forward
        x = self.bn1(x)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 81, in forward
        exponential_average_factor, self.eps)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/nn/functional.py", line 1656, in batch_norm
        training, momentum, eps, torch.backends.cudnn.enabled
    
    Traceback (most recent call last):
      File "tools/train.py", line 300, in <module>
        main()
      File "tools/train.py", line 259, in main
        est=model_est, local_rank=args.local_rank)
      File "/opt/tiger/cream/lib/core/train.py", line 67, in train_epoch
        loss.backward()
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/home/tiger/.conda/envs/Cream/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
        allow_unreachable=True)  # allow_unreachable flag
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [320]] is at version 2507; expected version 2506 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
    

    I tried to locate the source of the error and found that it appears whenever the code updates the meta network or adds the kd_loss to the final loss. How can I fix this problem?

    opened by Ema1997 11
  • MiniVit: Some NCCL operations have failed or timed out

    When I try to run Mini-DeiT with 6 GPUs on the same node, training stops within the first few epochs with errors like:

    [E ProcessGroupNCCL.cpp:587] [Rank 5] Watchdog caught collective operation timeout: WorkNCCL(OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1808699 milliseconds before timing out.
    [E ProcessGroupNCCL.cpp:587] [Rank 4] Watchdog caught collective operation timeout: WorkNCCL(OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1808705 milliseconds before timing out.
    [E ProcessGroupNCCL.cpp:587] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1808703 milliseconds before timing out.
    [E ProcessGroupNCCL.cpp:587] [Rank 2] Watchdog caught collective operation timeout: WorkNCCL(OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1808731 milliseconds before timing out.
    [E ProcessGroupNCCL.cpp:587] [Rank 1] Watchdog caught collective operation timeout: WorkNCCL(OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1808750 milliseconds before timing out.
    [E ProcessGroupNCCL.cpp:587] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1808749 milliseconds before timing out.
    [E ProcessGroupNCCL.cpp:341] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
    [E ProcessGroupNCCL.cpp:341] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
    [E ProcessGroupNCCL.cpp:341] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
    [E ProcessGroupNCCL.cpp:341] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.

    Can you tell me what might cause this problem? Thanks a lot!

    MiniViT 
    opened by Ga-Lee 7
  • Using TinyVit_5m_224 as the backbone to train a segmentation task

    Hi, thanks for sharing your excellent work. I want to use TinyVit_5m_224 as the backbone for a segmentation task whose input size is 512x512. Do I need to change the original weights because of the different input size? If so, how can I do it?

    TinyViT 
    opened by haoxurt 6
  • RuntimeError: CUDA error: invalid device function?

    We compiled the CUDA version of the iRPE module with the setup.py file in DETR-with-iRPE. When we start training the model, the following error occurs:

      File "***/rpe_attention/rpe_attention_function.py", line 330, in rpe_multi_head_attention_forward
    attn_output_weights_view.add_(rpe_k(q_view, height=hw[0], width=hw[1]))
    

    RuntimeError: CUDA error: invalid device function

    The environment of our project is:

    pytorch:1.9.1
    python:3.8
    torchvision: 0.10.1
    cudatoolkit: 10.2.89
    

    I debugged the training process, and the main reason seems to be that the outputs of rpe_k(*) and rpe_q cannot be added to attn_output_weights_view. Could you give a suggestion?

    iRPE 
    opened by chenfsjz 6
  • How to design the flops range FLOPS_MINIMUM and FLOPS_MAXIMUM to specify the desired model Flops?

    Hi, thanks for your excellent work. As the title says, how should the FLOPs range (FLOPS_MINIMUM and FLOPS_MAXIMUM) be set to obtain a model with the desired FLOPs? Since FLOPS_MINIMUM and FLOPS_MAXIMUM influence how subnets and the teacher network are sampled, should target models of 500M and 50M FLOPs use different settings?

    opened by sunnyxiaohu 6
  • FLOPs in the paper

    Hi, I have a question about the FLOPs reported in the paper. In Table 5, Cream-S has 287M FLOPs, but I found it should be 318M FLOPs based on the architecture in the appendix.

    opened by GG-Bonds 6
  • Accuracy of the network on the 50000 test images: 0.1%

    Hello, this is very meaningful work. I got this accuracy when running the following command:

      python -m torch.distributed.launch --nproc_per_node 8 main.py --cfg configs/22k_distill/tiny_vit_5m_22k_distill.yaml --data-path ./ImageNet --batch-size 128 --eval --resume ./checkpoints/tiny_vit_5m_22k_distill.pth --opts DATA.DATASET imagenet

    Did I do something wrong? The dataset I use is ILSVRC2012. Is this what the project calls ImageNet? Also, how can I run the evaluation on a CPU-only machine? I look forward to your reply, thank you!

    TinyViT 
    opened by DCBXZ66 4
  • About the teacher logits of the TinyViT

    Hi, thanks for sharing your excellent work. I am trying to use the script save_logits.py to generate the soft labels for knowledge distillation. During generation I found that the binary file for the same epoch differs when a different starting epoch is used, although running save_logits.py with the flag --check-saved-logits reports no difference or error. Where might these differences come from?

    To reproduce the issue, generate the logits of a specific epoch with different start-epoch settings.

    Thanks !

    TinyViT 
    opened by shadowpa0327 4
  • Questions about search space of Cream

    Hi, thank you for your great work! I am interested in Cream, but I ran into some problems when reading both the paper and the source code.

    Question 1:

    In supernet.py:

      arch_def = [
          # stage 0, 112x112 in
          ['ds_r1_k3_s1_e1_c16_se0.25'],
          # stage 1, 112x112 in
          ['ir_r1_k3_s2_e4_c24_se0.25', 'ir_r1_k3_s1_e4_c24_se0.25', 'ir_r1_k3_s1_e4_c24_se0.25',
           'ir_r1_k3_s1_e4_c24_se0.25'],
          # stage 2, 56x56 in
          ['ir_r1_k5_s2_e4_c40_se0.25', 'ir_r1_k5_s1_e4_c40_se0.25', 'ir_r1_k5_s2_e4_c40_se0.25',
           'ir_r1_k5_s2_e4_c40_se0.25'],
          # stage 3, 28x28 in
          ['ir_r1_k3_s2_e6_c80_se0.25', 'ir_r1_k3_s1_e4_c80_se0.25', 'ir_r1_k3_s1_e4_c80_se0.25',
           'ir_r2_k3_s1_e4_c80_se0.25'],
          # stage 4, 14x14in
          ['ir_r1_k3_s1_e6_c96_se0.25', 'ir_r1_k3_s1_e6_c96_se0.25', 'ir_r1_k3_s1_e6_c96_se0.25',
           'ir_r1_k3_s1_e6_c96_se0.25'],
          # stage 5, 14x14in
          ['ir_r1_k5_s2_e6_c192_se0.25', 'ir_r1_k5_s1_e6_c192_se0.25', 'ir_r1_k5_s2_e6_c192_se0.25',
           'ir_r1_k5_s2_e6_c192_se0.25'],
          # stage 6, 7x7 in
          ['cn_r1_k1_s1_c320_se0.25'],
      ]
    

    There are specific numbers of blocks for each stage. In this case, there are 4,4,5,4,4 blocks.

    However, in your paper, the repeat number ranges from 4 to 6. The code doesn't match the description in the paper.

    image

    Question 2:

    Another question is about skip connection operation. In your paper, the description is as below:

    image

    But I can not find the Skip Connection in your search space.

    https://github.com/microsoft/Cream/blob/bd49e0b933eeb60147f2b3bcecf241f06d92b435/Cream/lib/models/builders/build_supernet.py#L34

    There are only 6 operations in your Search Space.

    cream 
    opened by pprp 4
  • Problem with nvcc fatal : Unsupported gpu architecture 'compute_86'

    Hi, thanks for your great work. I have a problem when I run the following commands:

      cd /rpe_ops
      python setup.py install --user

    FAILED: /home/UserDirectory/hongshengz/Stark-main/lib/models/stark/rpe_ops/build/temp.linux-x86_64-3.8/rpe_index_cuda.o
    /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/UserDirectory/hongshengz/Stark-main/lib/models/stark/rpe_ops/build/temp.linux-x86_64-3.8/rpe_index_cuda.o.d -DWITH_CUDA -I/home/UserDirectory/hongshengz/anaconda3/lib/python3.8/site-packages/torch/include -I/home/UserDirectory/hongshengz/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/UserDirectory/hongshengz/anaconda3/lib/python3.8/site-packages/torch/include/TH -I/home/UserDirectory/hongshengz/anaconda3/lib/python3.8/site-packages/torch/include/THC -I/home/UserDirectory/hongshengz/anaconda3/include/python3.8 -c -c /home/UserDirectory/hongshengz/Stark-main/lib/models/stark/rpe_ops/rpe_index_cuda.cu -o /home/UserDirectory/hongshengz/Stark-main/lib/models/stark/rpe_ops/build/temp.linux-x86_64-3.8/rpe_index_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=rpe_index_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
    nvcc fatal : Unsupported gpu architecture 'compute_86'

    My environment is: RTX3090 CUDA:11.4
    torch_version: 1.8.1

    iRPE 
    opened by hongsheng-Z 4
  • Loss Nan for AutoFormer Base Model

    Thanks for your work! I tried to reproduce the base model of AutoFormer, but the loss sometimes becomes NaN between the 200th and 300th epochs. Do you have any idea how to solve this problem?

    AutoFormer 
    opened by rehulisw 1
  • Rethinking and Improving Relative Position Encoding for Vision Transformer with memory optimized attentions

    Hello, I was wondering whether your relative position encoding schemes would work with memory-optimized attention mechanisms, for example the one presented in FlashAttention: https://arxiv.org/abs/2205.14135

    iRPE 
    opened by jakubMitura14 1
  • Model architecture search in TinyViT framework

    To reproduce your work, I looked for the search algorithm that finds tinier versions of the parent model using the "constrained local search" mentioned in the paper, but could not find it.

    Could you release the search algorithm in which you used the progressive model contraction approach to find better architectures with good performance?

    TinyViT 
    opened by NKSagarReddy 3
  • A potential bug in AutoFormer

    https://github.com/microsoft/Cream/blob/a857830192d472e6776e9af4bbd988f35ebf1f4d/AutoFormer/model/module/qkv_super.py#L72-L83

    In qkv_super, the weight and bias sharing strategies are different. I think the selection of the bias is unreasonable and should be modified as follows.

     import torch

     def sample_bias(bias, sample_out_dim):
         # Interleaved selection with stride 3; bias is 1-D, so no column index.
         return torch.cat([bias[i:sample_out_dim:3] for i in range(3)], dim=0)

    opened by crj1998 0
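
The difference between the interleaved selection proposed in the comment above and a plain prefix selection can be checked with ordinary Python lists. This is only an illustration: the helper names `interleaved` and `prefix` are hypothetical, and a list of integers stands in for the 1-D bias tensor of a fused q/k/v projection.

```python
def interleaved(values, sample_out_dim):
    """Concatenate the strided slices values[i:sample_out_dim:3] for i = 0, 1, 2."""
    picked = []
    for i in range(3):
        picked.extend(values[i:sample_out_dim:3])
    return picked


def prefix(values, sample_out_dim):
    """Plain prefix selection: the first sample_out_dim entries."""
    return values[:sample_out_dim]


bias = list(range(9))  # stand-in for a 1-D bias tensor
a = interleaved(bias, 6)
b = prefix(bias, 6)
print(a)  # [0, 3, 1, 4, 2, 5]
print(b)  # [0, 1, 2, 3, 4, 5]
```

Both selections pick the same entries but in a different order, so if the weight rows are reordered by the interleaved scheme while the bias keeps prefix order, row i of the sampled weight and entry i of the sampled bias no longer come from the same output unit.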