Code for "AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling"

Last update: Oct 26, 2022

Overview

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

This repository contains PyTorch evaluation code, training code and pretrained models for AttentiveNAS.

For details, see "AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling" by Dilin Wang, Meng Li, Chengyue Gong and Vikas Chandra.

If you find this project useful in your research, please consider citing:

@article{wang2020attentivenas,
  title={AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling},
  author={Wang, Dilin and Li, Meng and Gong, Chengyue and Chandra, Vikas},
  journal={arXiv preprint arXiv:2011.09011},
  year={2020}
}

Pretrained models and data

Download our pretrained AttentiveNAS models and a (sub-network, FLOPs) lookup table from Google Drive and place them in the ./attentive_nas_data folder.

Evaluation

To evaluate our pre-trained AttentiveNAS models, from AttentiveNAS-A0 to A6, on ImageNet val with a single GPU, run:

python test_attentive_nas.py --config-file ./configs/eval_attentive_nas_models.yml --model a[0-6]
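
In the command above, a[0-6] stands for one model name from a0 to a6. To evaluate all seven models in turn, the commands can be generated with a small script. The following is a minimal sketch, not part of the repository: the helper name eval_commands is hypothetical, and only the script name, config path and flags are taken from the command above.

```python
import shlex


def eval_commands(models=range(7),
                  config="./configs/eval_attentive_nas_models.yml"):
    """Build one evaluation command per model index (a0 .. a6).

    Hypothetical helper: it only assembles argument lists and does not
    launch anything itself.
    """
    return [
        ["python", "test_attentive_nas.py",
         "--config-file", config,
         "--model", f"a{i}"]
        for i in models
    ]


# Print the commands so they can be inspected or pasted into a shell.
for cmd in eval_commands():
    print(shlex.join(cmd))
```

Each argument list can be passed to subprocess.run(cmd, check=True) to run the evaluations one after another on a single GPU.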

Expected results:

Name	MFLOPs	Top-1 (%)
AttentiveNAS-A0	203	77.3
AttentiveNAS-A1	279	78.4
AttentiveNAS-A2	317	78.8
AttentiveNAS-A3	357	79.1
AttentiveNAS-A4	444	79.8
AttentiveNAS-A5	491	80.1
AttentiveNAS-A6	709	80.7

Training

To train our AttentiveNAS models from scratch, run:

python train_supernet.py --config-file configs/train_attentive_nas_models.yml --machine-rank ${machine_rank} --num-machines ${num_machines} --dist-url ${dist_url}

We adopt SGD training on 64 GPUs. The mini-batch size is 32 per GPU; all training hyper-parameters are specified in train_attentive_nas_models.yml.
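
With 64 GPUs and 32 samples per GPU, the global mini-batch is 64 x 32 = 2048 images. The sketch below checks that arithmetic and shows how the per-machine launch commands might be assembled. It is an illustration under assumptions, not the repository's launcher: the helper names are hypothetical, the split into 8 machines (8 GPUs each) is assumed because the README only gives the total GPU count, and the dist_url value is a placeholder.

```python
def global_batch_size(num_gpus: int, per_gpu_batch: int) -> int:
    """Total samples per SGD step across all GPUs."""
    return num_gpus * per_gpu_batch


def launch_commands(num_machines: int, dist_url: str,
                    config: str = "configs/train_attentive_nas_models.yml"):
    """Build one train_supernet.py command per machine (hypothetical helper).

    Each machine gets its own --machine-rank in 0 .. num_machines - 1;
    the other flags are identical on every machine.
    """
    return [
        ["python", "train_supernet.py",
         "--config-file", config,
         "--machine-rank", str(rank),
         "--num-machines", str(num_machines),
         "--dist-url", dist_url]
        for rank in range(num_machines)
    ]


print(global_batch_size(num_gpus=64, per_gpu_batch=32))  # 2048

# Assumption: 8 machines; "tcp://HOST:PORT" is a placeholder, not a real address.
for cmd in launch_commands(num_machines=8, dist_url="tcp://HOST:PORT"):
    print(" ".join(cmd))
```

If fewer GPUs are available, the global batch size shrinks proportionally, so the hyper-parameters in train_attentive_nas_models.yml may need to be adjusted to match.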

License

The majority of AttentiveNAS is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: Once For All is licensed under the Apache 2.0 license.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Owner

Facebook Research