VFormer
A PyTorch library for Vision Transformers
Getting Started
Read the contributing guidelines in CONTRIBUTING.rst to learn how to start contributing.
viz module. We can replace the _Projection class with a one-line if-else statement.
Should we replace it with if-else or should we keep the current implementation?
cc: @NeelayS @aditya-agrawal-30502 @alvanli
During the last PR (#45), I had to revert because of compatibility issues.
In this PR I have added some docstrings and made minor changes, such as renaming variables.
This PR is the same as #48, with an edited title :)
@NeelayS
The AbsolutePositionEmbedding class was structured specifically for PVT, but we can use it in other models too if we restructure it properly. It should also support sinusoidal position embedding; alternatively, a separate class for sinusoidal embedding would work.
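For reference, here is a minimal, framework-free sketch of the fixed sinusoidal table such a class could produce. The function name `sinusoidal_position_embedding` and its arguments are hypothetical and not part of VFormer; a real implementation would most likely build a torch tensor and register it as a buffer.

```python
import math


def sinusoidal_position_embedding(num_positions, dim):
    """Return a num_positions x dim table of fixed sinusoidal embeddings.

    Hypothetical helper for illustration only; not VFormer's API.
    Even columns hold sin(pos / 10000^(i / dim)), odd columns the matching cos.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even so sin/cos columns pair up")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, dim, 2):
            # i is the even column index, so the exponent is i / dim.
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table


table = sinusoidal_position_embedding(num_positions=4, dim=8)
```

Because the table depends only on the sequence length and the embedding width, it can be computed once and added to the patch embeddings of any model, which is what would make it reusable outside PVT.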
enhancement
This paper describes how promoting smoothness with a recently proposed sharpness-aware optimizer substantially improves the performance of ViTs.
It would be good to have an implementation of this optimizer in our library. It would fit in the functional module.
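To make the request concrete, here is a rough sketch of one sharpness-aware update, assuming the optimizer in question follows the usual two-step sharpness-aware minimization (SAM) scheme. It uses plain Python lists instead of torch tensors, and the names `sam_step`, `grad_fn`, `lr` and `rho` are illustrative assumptions, not an existing VFormer interface.

```python
import math


def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware update on a list of floats (illustrative only)."""
    # 1) Gradient at the current weights.
    grads = grad_fn(weights)
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0:
        # Already at a stationary point; nothing to update.
        return list(weights)
    # 2) Move to the nearby "worst case" point w + rho * g / ||g||.
    perturbed = [w + rho * g / norm for w, g in zip(weights, grads)]
    # 3) Gradient at the perturbed point.
    sharp_grads = grad_fn(perturbed)
    # 4) Descend from the ORIGINAL weights using the perturbed-point gradient.
    return [w - lr * g for w, g in zip(weights, sharp_grads)]


def quadratic_grad(weights):
    """Gradient of the toy loss f(w) = sum(w_i ** 2)."""
    return [2.0 * w for w in weights]


weights = [1.0, -2.0]
for _ in range(50):
    weights = sam_step(weights, quadratic_grad)
```

In a PyTorch version the same two gradient evaluations would typically wrap a base optimizer such as SGD, so each training step costs two forward and backward passes.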
I have added some fixes for page breaks in #86.
Still, we need to enhance the docs for visualization methods.
We can include the license/copyright disclaimer for visualization methods in our license or have a separate file.
Additionally, we can add the sample outputs from these methods into the doc.
CC : @NeelayS @aditya-agrawal-30502 @alvanli
documentation, enhancement, good first issue
Paper: https://arxiv.org/abs/2202.09741
Code: https://github.com/Visual-Attention-Network/VAN-Classification and https://github.com/Visual-Attention-Network/VAN-Segmentation
Paper implementation
First release of VFormer!
NSGDC Some codes in this repo are copied/modified from opensource implementations made available by UNITER, PyTorch, HuggingFace, OpenNMT, and Nvidia.
Malware Env for OpenAI Gym Citing If you use this code in a publication please cite the following paper: Hyrum S. Anderson, Anant Kharkar, Bobby Fila
Active Learning with the Nvidia TLT Tutorial on active learning with the Nvidia Transfer Learning Toolkit (TLT). In this tutorial, we will show you ho
PlantDoc: A Dataset for Visual Plant Disease Detection This repository contains the Cropped-PlantDoc dataset used for benchmarking classification mode
Randomised controlled trial abstract result tabulator RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into
ffcv ImageNet Training A minimal, single-file PyTorch ImageNet training script designed for hackability. Run train_imagenet.py to get... ...high accur
3D human pose estimation in video with temporal convolutions and semi-supervised training This is the implementation of the approach described in the
Code for the paper "There is no Double-Descent in Random Forests" This repository contains the code to run the experiments for our paper called "There
Language Emergence in Multi Agent Dialog Code for the Paper Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog Satwik Kottur, José M.
Sleep_Staging_Knowledge Distillation This codebase implements knowledge distillation approach for ECG based sleep staging assisted by EEG based sleep
Quantile Regression DQN Quantile Regression DQN a Minimal Working Example, Distributional Reinforcement Learning with Quantile Regression (https://arx
Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval (M2HSE) PyTorch code fo
CORNELLSASLAB SAS output to EXCEL converter for Cornell/MIT Language and acquisition lab Instructions: This python code can be used to convert SAS out
H3DS Dataset This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction Access
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection This material is supplementary code for paper accepted in ICDAR 2021 We h
Winning submission to the 2021 Brain Tumor Segmentation Challenge This repo contains the codes and pretrained weights for the winning submission to th
SPAAR Description A toolset of Python programs for signal modeling via sparse semilinear autoregressors. References Vides, F. (2021). Computing Semili
EdiBERT, a generative model for image editing EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation. The
Python wrapper class for OpenVINO Model Server. User can submit inference request to OVMS with just a few lines of code.
Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran