SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems

Overview

SLIDE

The SLIDE package contains the source code for reproducing the main experiments in this paper.

Dataset

The Datasets can be downloaded in Amazon-670K. Note that the data is sorted by labels so please shuffle at least the validation/testing data.

TensorFlow Baselines

We suggest directly get TensorFlow docker image to install TensorFlow-GPU. For TensorFlow-CPU compiled with AVX2, we recommend using this precompiled build.

Also there is a TensorFlow docker image specifically built for CPUs with AVX-512 instructions, to get it use:

docker pull clearlinux/stacks-dlrs_2-mkl    

config.py controls the parameters of TensorFlow training like learning rate. example_full_softmax.py, example_sampled_softmax.py are example files for Amazon-670K dataset with full softmax and sampled softmax respectively.

Build/Run on Intel platform

Prerequisites:

CMake >= 3.0 Intel Compiler (ICC) >= 19

Build with ICC compiler

source /opt/intel/compilers_and_libraries/linux/bin/compilervars.sh -arch intel64 -platform linux
cd /path/to/slide-root
mkdir -p bin && cd bin 
# BDW (AVX2)
cmake .. -DCMAKE_CXX_COMPILER=icpc -DCMAKE_C_COMPILER=icc
# SKX/CLX (AVX512)
cmake .. -DCMAKE_CXX_COMPILER=icpc -DCMAKE_C_COMPILER=icc -DOPT_AVX512=1
# CPX (AVX512 + BF16)
cmake .. -DCMAKE_CXX_COMPILER=icpc -DCMAKE_C_COMPILER=icc -DOPT_AVX512=1 -DOPT_AVX512_BF16=1
make -j

Run on Intel SKX/CLX/CPX

cd bin
OMP_NUM_THREADS= KMP_HW_SUBSET=s,c,t KMP_AFFINITY=compact,granularity=fine KMP_BLOCKTIME=200 ./runme ../SLIDE/Config_amz.csv
For example, on CLX8280 2Sx28c:
OMP_NUM_THREADS=112 KMP_HW_SUBSET=2s,28c,2t KMP_AFFINITY=compact,granularity=fine KMP_BLOCKTIME=200 ./runme ../SLIDE/Config_amz.csv

For best performance please set Batchsize=multiple-of-logic-core-number from SLIDE/Config_amz.csv.

Results can be checked from the log file under dataset:

tail -f dataset/log.txt
Owner
Intel Labs
Intel Labs
Application of the L2HMC algorithm to simulations in lattice QCD.

l2hmc-qcd ๐Ÿ“Š Slides Recent talk on Training Topological Samplers for Lattice Gauge Theory from the Machine Learning for High Energy Physics, on and of

Sam Foreman 37 Dec 14, 2022
A package related to building quasi-fibration symmetries

qf A package related to building quasi-fibration symmetries. If you'd like to learn more about how it works, see the brief explanation and References

Paolo Boldi 1 Dec 01, 2021
Pytorch implementation of AREL

Status: Archive (code is provided as-is, no updates expected) Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement

8 Nov 25, 2022
PyTorch implementation of Deep HDR Imaging via A Non-Local Network (TIP 2020).

NHDRRNet-PyTorch This is the PyTorch implementation of Deep HDR Imaging via A Non-Local Network (TIP 2020). 0. Differences between Original Paper and

Yutong Zhang 1 Mar 01, 2022
DM-ACME compatible implementation of the Arm26 environment from Mujoco

ACME-compatible implementation of Arm26 from Mujoco This repository contains a customized implementation of Mujoco's Arm26 model, that can be used wit

1 Dec 24, 2021
PyGCL: A PyTorch Library for Graph Contrastive Learning

PyGCL is a PyTorch-based open-source Graph Contrastive Learning (GCL) library, which features modularized GCL components from published papers, standa

PyGCL 588 Dec 31, 2022
codes for Self-paced Deep Regression Forests with Consideration on Ranking Fairness

Self-paced Deep Regression Forests with Consideration on Ranking Fairness This is official codes for paper Self-paced Deep Regression Forests with Con

Learning in Vision 4 Sep 11, 2022
Code to train models from "Paraphrastic Representations at Scale".

Paraphrastic Representations at Scale Code to train models from "Paraphrastic Representations at Scale". The code is written in Python 3.7 and require

John Wieting 71 Dec 19, 2022
PyTorch implementation of "PatchGame: Learning to Signal Mid-level Patches in Referential Games" to appear in NeurIPS 2021

PatchGame: Learning to Signal Mid-level Patches in Referential Games This repository is the official implementation of the paper - "PatchGame: Learnin

Kamal Gupta 22 Mar 16, 2022
Thermal Control of Laser Powder Bed Fusion using Deep Reinforcement Learning

This repository is the implementation of the paper "Thermal Control of Laser Powder Bed Fusion Using Deep Reinforcement Learning", linked here. The project makes use of the Deep Reinforcement Library

BaratiLab 11 Dec 27, 2022
๐Ÿฆ™ LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

๐Ÿฆ™ LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Advanced Image Manipulation Lab @ Samsung AI Center Moscow 4.7k Dec 31, 2022
Official pytorch implementation of the AAAI 2021 paper Semantic Grouping Network for Video Captioning

Semantic Grouping Network for Video Captioning Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo. AAAI 2021. [arxiv] Environment Ubuntu 16.04 CU

Hobin Ryu 43 Nov 25, 2022
Build and run Docker containers leveraging NVIDIA GPUs

NVIDIA Container Toolkit Introduction The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includ

NVIDIA Corporation 15.6k Jan 01, 2023
[Machine Learning Engineer Basic Guide] ๋ถ€์ŠคํŠธ์บ ํ”„ AI Tech - Product Serving ์ž๋ฃŒ

Boostcamp-AI-Tech-Product-Serving ๋ถ€์ŠคํŠธ์บ ํ”„ AI Tech - Product Serving ์ž๋ฃŒ Repository ๊ตฌ์กฐ part1(MLOps ๊ฐœ๋ก , Model Serving, ๋จธ์‹ ๋Ÿฌ๋‹ ํ”„๋กœ์ ํŠธ ๋ผ์ดํ”„ ์‚ฌ์ดํด์€ ๋ณ„๋„์˜ ์ฝ”๋“œ๊ฐ€ ์—†์œผ๋ฉฐ, part

Sung Yun Byeon 269 Dec 21, 2022
List of content farm sites like g.penzai.com.

ๅ†…ๅฎนๅ†œๅœบ็ฝ‘็ซ™ๆธ…ๅ• Google ไธญๆ–‡ๆœ็ดข็ป“ๆžœๅŒ…ๅซไบ†็›ธๅฝ“ไธ€้ƒจๅˆ†็š„ๅ†…ๅฎนๅ†œๅœบๅผๆก็›ฎ๏ผŒๆฏ”ๅฆ‚ใ€Œๅฐ X ็Ÿฅ่ฏ†็ฝ‘ใ€ใ€Œๅฐ X ็™พ็ง‘็ฝ‘ใ€ใ€‚ๆญค็ง้“พๆŽฅๅธธไผš 302 ้‡ๅฎšๅ‘ๅ…ถไธป็ซ™๏ผŒ้กต้ขๅ†…ๅฎนไธบ่‡ชๅŠจ็”Ÿๆˆ๏ผŒๅคง้‡ๅ †ๅ ๅ…ณ้”ฎๅญ—๏ผŒๆ‰ๆ‚ไธ€ไบ›็ˆฌๅ–ๅˆฐ็š„ๅ†…ๅฎน๏ผŒๅฎŒๅ…จไธๅ…ทๅฏ่ฏปๆ€งๅ’Œๅ‚่€ƒไปทๅ€ผใ€‚ ๅฐคไธบ่ฟ‡ๅˆ†็š„ๆ˜ฏ๏ผŒ่ฏฅ็ฑป็ฝ‘็ซ™ๅฏ่ƒฝๆœ‰ๆˆๅƒไธŠไธ‡ไธชๅˆ†่บซๅŸŸๅ่ขซ Goog

WDMPA 541 Jan 03, 2023
This repository contains the needed resources to build the HIRID-ICU-Benchmark dataset

HiRID-ICU-Benchmark This repository contains the needed resources to build the HIRID-ICU-Benchmark dataset for which the manuscript can be found here.

Biomedical Informatics at ETH Zurich 30 Dec 16, 2022
code for "Self-supervised edge features for improved Graph Neural Network training",

Self-supervised edge features for improved Graph Neural Network training Data availability: Here is a link to the raw data for the organoids dataset.

Neal Ravindra 23 Dec 02, 2022
Rotation Robust Descriptors

RoRD Rotation-Robust Descriptors and Orthographic Views for Local Feature Matching Project Page | Paper link Evaluation and Datasets MMA : Training on

Udit Singh Parihar 25 Nov 15, 2022
A python script to lookup Passport Index Dataset

visa-cli A python script to lookup Passport Index Dataset Installation pip install visa-cli Usage usage: visa-cli [-h] [-d DESTINATION_COUNTRY] [-f]

rand-net 16 Oct 18, 2022
A Quick and Dirty Progressive Neural Network written in TensorFlow.

prog_nn .โ–„โ–„ ยท โ–„ยท โ–„โ–Œ โ– โ–„ โ–„โ–„โ–„ยท โ– โ–„ โ–โ–ˆ โ–€. โ–โ–ˆโ–ชโ–ˆโ–ˆโ–Œโ€ขโ–ˆโ–Œโ–โ–ˆโ–โ–ˆ โ–„โ–ˆโ–ช โ€ขโ–ˆโ–Œโ–โ–ˆ โ–„โ–€โ–€โ–€โ–ˆโ–„โ–โ–ˆโ–Œโ–โ–ˆโ–ชโ–โ–ˆโ–โ–โ–Œ โ–ˆโ–ˆโ–€

SynPon 53 Dec 12, 2022