An easy-to-use app to visualise attentions of various VQA models.

Last update: Nov 13, 2022

Overview

Ask Me Anything: A tool for visualising Visual Question Answering (AMA)

AMA is an easy-to-use app for visualising the attention maps of Visual Question Answering (VQA) models. A live demo of the app is hosted on Streamlit Sharing.

• Models
• Requirements
• Installation
• How to run
• How to use
• Contributing
• Acknowledgements

Models

• MFB - Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu, Jun Yu, Jianping Fan, Dacheng Tao
arXiv

• (Coming soon) MCAN - Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian
arXiv
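As a rough illustration of what "visualising attention" can mean for the question side of a co-attention model, the sketch below normalises raw per-word scores with a softmax and renders them as text bars. This is not the app's code: the function names, the sample question, and the scores are all invented for illustration, and the app itself may compute and display attention quite differently.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def render_question_attention(tokens, scores, width=20):
    """Return one text line per token with a bar proportional to its weight."""
    weights = softmax(scores)
    lines = []
    for token, weight in zip(tokens, weights):
        bar = "#" * round(weight * width)
        lines.append(f"{token:<10} {weight:.2f} {bar}")
    return lines


# Hypothetical question and raw scores, invented for illustration only.
tokens = ["what", "colour", "is", "the", "cat"]
scores = [0.1, 2.0, 0.0, 0.0, 1.5]
for line in render_question_attention(tokens, scores):
    print(line)
```

Words with longer bars are the ones such a model would be attending to most when answering the question.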

Requirements

The pinned versions from the requirements.txt file are listed below.

opencv_python==4.4.0.46
numpy==1.19.4
pandas==1.1.4
torch==1.4.0
matplotlib==3.3.2
gdown==3.12.2
seaborn==0.11.0
dotmap==1.3.23
streamlit==0.70.0
Pillow==8.0.1
PyYAML==5.3.1

Installation

1. Install Anaconda.
2. Clone this repository and cd into it:
   git clone https://github.com/apugoneappu/ask_me_anything.git && cd ask_me_anything
3. In a new environment (new_env), install the dependencies:
   pip install -r requirements.txt
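After installing, it can be useful to confirm that the environment actually matches the pinned versions. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper names `parse_pins` and `find_mismatches` are hypothetical and not part of this repository.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Two of the pins listed in this README, used as sample input.
sample = "torch==1.4.0\nstreamlit==0.70.0\n"
print(parse_pins(sample))
```

To check a real environment, pass the contents of requirements.txt to `parse_pins` and the result to `find_mismatches`; an empty dict means every pinned package is installed at the expected version.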

How to run

From the root of this repository:

1. Activate the environment and start the app:
   conda activate new_env
   streamlit run main.py
2. In a browser tab, open the Network URL displayed in your terminal.
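If you prefer to start the app from Python, for example from a helper script, the same command can be assembled and run with `subprocess`. This is only a sketch: `build_streamlit_command` and `launch` are hypothetical helpers, not part of this repository, and the port value is an arbitrary example passed through Streamlit's `--server.port` option.

```python
import subprocess
import sys


def build_streamlit_command(script="main.py", port=None):
    """Build the argument list for launching a Streamlit script."""
    # Running Streamlit as a module ties it to the current interpreter,
    # so the active environment's packages are the ones used.
    command = [sys.executable, "-m", "streamlit", "run", script]
    if port is not None:
        command += ["--server.port", str(port)]
    return command


def launch(script="main.py", port=None):
    """Start the app and block until the Streamlit server exits."""
    return subprocess.run(build_streamlit_command(script, port), check=True)


# Show the command that would be run; call launch() to actually start the app.
print(build_streamlit_command(port=8502))
```

Calling `launch()` with no arguments is intended to be equivalent to running `streamlit run main.py` in the activated environment.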

Done! 🎉

How to use

Contributing

First of all, thank you for wanting to contribute to this work! I will try to make your job as easy as possible. Detailed instructions are coming soon.

Acknowledgements

This repository has been built by modifying the OpenVQA repository.

I would also like to thank Yash Khandelwal, Nikhil Shah and Chinmay Singh for their support and amazing suggestions!

Huge thanks to Streamlit for making all of this possible and for Streamlit Sharing that enables free hosting of this app! ❤️


Owner

Apoorve
