MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

Octave Convolution

ImageNet

Ablation

  • Loss: Softmax
  • Learning rate: Cosine (warm-up: 5 epochs, lr: 0.4)
  • MXNet API: Symbol API
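
The schedule listed above (cosine decay, 5 warm-up epochs, peak lr 0.4) can be sketched in plain Python as follows. This is a minimal illustration, not the training code from this repository: the total of 100 epochs, the linear warm-up starting from 0, and the final learning rate of 0 are assumptions, since the settings above do not state them.

```python
import math


def cosine_lr(epoch, base_lr=0.4, warmup_epochs=5, total_epochs=100):
    """Learning rate at a (possibly fractional) epoch.

    base_lr and warmup_epochs follow the settings listed above.
    total_epochs=100 and the linear ramp from 0 are assumptions.
    """
    if epoch < warmup_epochs:
        # Linear warm-up from 0 to base_lr.
        return base_lr * epoch / warmup_epochs
    # Cosine decay from base_lr down to 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(progress, 1.0)))


for e in (0, 2.5, 5, 52.5, 100):
    print(f"epoch {e:>5}: lr = {cosine_lr(e):.4f}")
```

The rate rises to 0.4 by the end of epoch 5, falls to half of that midway through the decay phase, and reaches 0 at the last epoch.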

| Model        | baseline    | alpha = 0.125 | alpha = 0.25 | alpha = 0.5 | alpha = 0.75 |
|--------------|-------------|---------------|--------------|-------------|--------------|
| DenseNet-121 | 75.4 / 92.7 | 76.1 / 93.0   | 75.9 / 93.1  | --          | --           |
| ResNet-26    | 73.2 / 91.3 | 75.8 / 92.6   | 76.1 / 92.6  | 75.5 / 92.5 | 74.6 / 92.1  |
| ResNet-50    | 77.0 / 93.4 | 78.2 / 93.9   | 78.0 / 93.8  | 77.4 / 93.6 | 76.7 / 93.0  |
| SE-ResNet-50 | 77.6 / 93.6 | 78.7 / 94.1   | 78.4 / 94.0  | 77.9 / 93.8 | 77.4 / 93.5  |
| ResNeXt-50   | 78.4 / 94.0 | --            | 78.8 / 94.2  | 78.4 / 94.0 | 77.5 / 93.6  |
| ResNet-101   | 78.5 / 94.1 | 79.2 / 94.4   | 79.2 / 94.4  | 78.7 / 94.1 | --           |
| ResNeXt-101  | 79.4 / 94.6 | --            | 79.6 / 94.5  | 78.9 / 94.4 | --           |
| ResNet-200   | 79.6 / 94.7 | 80.0 / 94.9   | 79.8 / 94.8  | 79.5 / 94.7 | --           |
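
In the paper, alpha is the fraction of channels assigned to the low-frequency branch, which is stored at half the spatial resolution. The sketch below shows how a channel count could be split for a given alpha and reproduces the paper's theoretical per-layer FLOPs ratio, 1 - (3/4) * alpha * (2 - alpha), by summing the four information paths. The function names and the `int()` rounding are illustrative assumptions, not taken from this repository, and the ratio applies to a single layer with the same alpha on input and output, so whole-network FLOPs will differ.

```python
def split_channels(channels, alpha):
    """Return (high_freq, low_freq) channel counts for a given alpha.

    Rounding down with int() is an assumption made for illustration.
    """
    low = int(alpha * channels)
    return channels - low, low


def octconv_flops_ratio(alpha):
    """Theoretical FLOPs of one OctConv layer relative to a vanilla convolution."""
    hh = (1 - alpha) ** 2         # high -> high, full resolution
    hl = alpha * (1 - alpha) / 4  # high -> low, computed at half resolution
    lh = alpha * (1 - alpha) / 4  # low -> high, computed at half resolution
    ll = alpha ** 2 / 4           # low -> low, half resolution
    return hh + hl + lh + ll


for a in (0.125, 0.25, 0.5, 0.75):
    high, low = split_channels(256, a)
    print(f"alpha={a}: {high} high / {low} low channels, "
          f"FLOPs ratio {octconv_flops_ratio(a):.3f}")
```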

Note:

  • The table shows Top-1 / Top-5 accuracy (%) measured with a single center crop. (testing script)
  • All residual networks in the ablation study adopt the pre-activation version [1] for convenience.
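
For readers unfamiliar with the metric, here is a minimal sketch of how Top-k accuracy is computed from per-class scores. It is not the repository's testing script: the function name and the toy scores are made up for illustration, and the example uses k = 2 on three classes where the table uses k = 5 on ImageNet.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy scores for three samples over three classes (illustrative only).
scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.3, 0.5],  # highest score: class 2
]
labels = [1, 1, 0]

print("top-1:", topk_accuracy(scores, labels, k=1))
print("top-2:", topk_accuracy(scores, labels, k=2))
```

Only the first sample is correct at k = 1, while the second also counts at k = 2 because its true class has the second-highest score.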

Others

  • Learning rate: Cosine (warm-up: 5 epochs, lr: 0.4)
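  • MXNet API: Gluon API

| Model                | alpha | label smoothing [2] | mixup [3] | #Params | #FLOPs | Top-1 / Top-5 |
|----------------------|-------|---------------------|-----------|---------|--------|---------------|
| 0.75 MobileNet (v1)  | .375  |                     |           | 2.6 M   | 213 M  | 70.5 / 89.5   |
| 1.0 MobileNet (v1)   | .5    |                     |           | 4.2 M   | 321 M  | 72.5 / 90.6   |
| 1.0 MobileNet (v2)   | .375  | Yes                 |           | 3.5 M   | 256 M  | 72.0 / 90.7   |
| 1.125 MobileNet (v2) | .5    | Yes                 |           | 4.2 M   | 295 M  | 73.0 / 91.2   |
| Oct-ResNet-152       | .125  | Yes                 | Yes       | 60.2 M  | 10.9 G | 81.4 / 95.4   |
| Oct-ResNet-152 + SE  | .125  | Yes                 | Yes       | 66.8 M  | 10.9 G | 81.6 / 95.7   |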
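
The two regularizers named in the table header can be sketched as follows. This is a plain-Python illustration of label smoothing [2] and mixup [3] on toy vectors, not the Gluon training code used here. The smoothing factor `eps=0.1` and the Beta parameter `mix_alpha=0.2` are assumed values, since the table does not state them, and `mix_alpha` is unrelated to the OctConv alpha column.

```python
import random


def smooth_labels(label, num_classes, eps=0.1):
    """One-hot target with eps of the mass spread uniformly over all classes.

    eps=0.1 is an assumed value for illustration.
    """
    target = [eps / num_classes] * num_classes
    target[label] += 1.0 - eps
    return target


def mixup(x1, y1, x2, y2, mix_alpha=0.2, rng=random):
    """Blend two samples and their targets with lam ~ Beta(mix_alpha, mix_alpha).

    mix_alpha=0.2 is an assumed value for illustration.
    """
    lam = rng.betavariate(mix_alpha, mix_alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


rng = random.Random(0)  # fixed seed so the example is repeatable
y_a = smooth_labels(0, num_classes=3)
y_b = smooth_labels(2, num_classes=3)
x, y, lam = mixup([1.0, 0.0], y_a, [0.0, 1.0], y_b, rng=rng)
print("lam:", lam)
print("mixed input:", x)
print("mixed target:", y)
```

Both functions keep the target a valid probability distribution, so they can be combined, as in the Oct-ResNet-152 rows.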

Citation

@article{chen2019drop,
  title={Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution},
  author={Chen, Yunpeng and Fan, Haoqi and Xu, Bing and Yan, Zhicheng and Kalantidis, Yannis and Rohrbach, Marcus and Yan, Shuicheng and Feng, Jiashi},
  journal={Proceedings of the IEEE International Conference on Computer Vision},
  year={2019}
}

Third-party Implementations

Acknowledgement

  • Thanks to MXNet, Gluon-CV and TVM!
  • Thanks to @Ldpe2G for sharing the code for calculating the #FLOPs (link).
  • Thanks to Min Lin (Mila), Xin Zhao (Qihoo Inc.) and Tao Wang (NUS) for helpful discussions during code development.

Reference

[1] He K, et al. "Identity Mappings in Deep Residual Networks".

[2] Szegedy C, et al. "Rethinking the Inception Architecture for Computer Vision".

[3] Zhang H, et al. "mixup: Beyond Empirical Risk Minimization".

License

The code and the models are MIT licensed, as found in the LICENSE file.
