A curated list of the top 10 computer vision papers in 2021 with video demos, articles, code and paper reference.

Overview

The Top 10 Computer Vision Papers of 2021

The top 10 computer vision papers in 2021 with video demos, articles, code, and paper reference.

While the world is still recovering, research hasn't slowed its frenetic pace, especially in the field of artificial intelligence. More, many important aspects were highlighted this year, like the ethical aspects, important biases, governance, transparency and much more. Artificial intelligence and our understanding of the human brain and its link to AI are constantly evolving, showing promising applications improving our life's quality in the near future. Still, we ought to be careful with which technology we choose to apply.

"Science cannot tell us what we ought to do, only what we can do."
- Jean-Paul Sartre, Being and Nothingness

Here are my top 10 of the most interesting research papers of the year in computer vision, in case you missed any of them. In short, it is basically a curated list of the latest breakthroughs in AI and CV with a clear video explanation, link to a more in-depth article, and code (if applicable). Enjoy the read, and let me know if I missed any important papers in the comments, or by contacting me directly on LinkedIn!

The complete reference to each paper is listed at the end of this repository.

Maintainer: louisfb01

Subscribe to my newsletter - The latest updates in AI explained every week.

Feel free to message me any interesting paper I may have missed to add to this repository.

Tag me on Twitter @Whats_AI or LinkedIn @Louis (What's AI) Bouchard if you share the list!

Watch the 2021 CV rewind


Missed last year? Check this out: 2020: A Year Full of Amazing AI papers- A Review


đź‘€ If you'd like to support my work and use W&B (for free) to track your ML experiments and make your work reproducible or collaborate with a team, you can try it out by following this guide! Since most of the code here is PyTorch-based, we thought that a QuickStart guide for using W&B on PyTorch would be most interesting to share.

👉 Follow this quick guide, use the same W&B lines in your code or any of the repos below, and have all your experiments automatically tracked in your w&b account! It doesn't take more than 5 minutes to set up and will change your life as it did for me! Here's a more advanced guide for using Hyperparameter Sweeps if interested :)

🙌 Thank you to Weights & Biases for sponsoring this repository and the work I've been doing, and thanks to any of you using this link and trying W&B!

Open In Colab


If you are interested in AI research, here is another great repository for you:

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code.

2021: A Year Full of Amazing AI papers- A Review


The Full List


DALL·E: Zero-Shot Text-to-Image Generation from OpenAI [1]

OpenAI successfully trained a network able to generate images from text captions. It is very similar to GPT-3 and Image GPT and produces amazing results.

Taming Transformers for High-Resolution Image Synthesis [2]

Tl;DR: They combined the efficiency of GANs and convolutional approaches with the expressivity of transformers to produce a powerful and time-efficient method for semantically-guided high-quality image synthesis.

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows [3]

Will Transformers Replace CNNs in Computer Vision? In less than 5 minutes, you will know how the transformer architecture can be applied to computer vision with a new paper called the Swin Transformer.

Deep nets: What have they ever done for vision? [bonus]

"I will openly share everything about deep nets for vision applications, their successes, and the limitations we have to address."

Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image [4]

The next step for view synthesis: Perpetual View Generation, where the goal is to take an image to fly into it and explore the landscape!

Total Relighting: Learning to Relight Portraits for Background Replacement [5]

Properly relight any portrait based on the lighting of the new background you add. Have you ever wanted to change the background of a picture but have it look realistic? If you’ve already tried that, you already know that it isn’t simple. You can’t just take a picture of yourself in your home and change the background for a beach. It just looks bad and not realistic. Anyone will just say “that’s photoshopped” in a second. For movies and professional videos, you need the perfect lighting and artists to reproduce a high-quality image, and that’s super expensive. There’s no way you can do that with your own pictures. Or can you?

If you’d like to read more research papers as well, I recommend you read my article where I share my best tips for finding and reading more research papers.

Animating Pictures with Eulerian Motion Fields [6]

This model takes a picture, understands which particles are supposed to be moving, and realistically animates them in an infinite loop while conserving the rest of the picture entirely still creating amazing-looking videos like this one...

CVPR 2021 Best Paper Award: GIRAFFE - Controllable Image Generation [7]

Using a modified GAN architecture, they can move objects in the image without affecting the background or the other objects!

TimeLens: Event-based Video Frame Interpolation [8]

TimeLens can understand the movement of the particles in-between the frames of a video to reconstruct what really happened at a speed even our eyes cannot see. In fact, it achieves results that our intelligent phones and no other models could reach before!

Subscribe to my weekly newsletter and stay up-to-date with new publications in AI for 2022!

(Style)CLIPDraw: Coupling Content and Style in Text-to-Drawing Synthesis [9]

Have you ever dreamed of taking the style of a picture, like this cool TikTok drawing style on the left, and applying it to a new picture of your choice? Well, I did, and it has never been easier to do. In fact, you can even achieve that from only text and can try it right now with this new method and their Google Colab notebook available for everyone (see references). Simply take a picture of the style you want to copy, enter the text you want to generate, and this algorithm will generate a new picture out of it! Just look back at the results above, such a big step forward! The results are extremely impressive, especially if you consider that they were made from a single line of text!

CityNeRF: Building NeRF at City Scale [10]

The model is called CityNeRF and grows from NeRF, which I previously covered on my channel. NeRF is one of the first models using radiance fields and machine learning to construct 3D models out of images. But NeRF is not that efficient and works for a single scale. Here, CityNeRF is applied to satellite and ground-level images at the same time to produce various 3D model scales for any viewpoint. In simple words, they bring NeRF to city-scale. But how?


If you would like to read more papers and have a broader view, here is another great repository for you covering 2020: 2020: A Year Full of Amazing AI papers- A Review and feel free to subscribe to my weekly newsletter and stay up-to-date with new publications in AI for 2022!

Tag me on Twitter @Whats_AI or LinkedIn @Louis (What's AI) Bouchard if you share the list!


Paper references

[1] A. Ramesh et al., Zero-shot text-to-image generation, 2021. arXiv:2102.12092

[2] Taming Transformers for High-Resolution Image Synthesis, Esser et al., 2020.

[3] Liu, Z. et al., 2021, “Swin Transformer: Hierarchical Vision Transformer using Shifted Windows”, arXiv preprint https://arxiv.org/abs/2103.14030v1

[bonus] Yuille, A.L., and Liu, C., 2021. Deep nets: What have they ever done for vision?. International Journal of Computer Vision, 129(3), pp.781–802, https://arxiv.org/abs/1805.04025.

[4] Liu, A., Tucker, R., Jampani, V., Makadia, A., Snavely, N. and Kanazawa, A., 2020. Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image, https://arxiv.org/pdf/2012.09855.pdf

[5] Pandey et al., 2021, Total Relighting: Learning to Relight Portraits for Background Replacement, doi: 10.1145/3450626.3459872, https://augmentedperception.github.io/total_relighting/total_relighting_paper.pdf.

[6] Holynski, Aleksander, et al. “Animating Pictures with Eulerian Motion Fields.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.

[7] Michael Niemeyer and Andreas Geiger, (2021), "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields", Published in CVPR 2021.

[8] Stepan Tulyakov*, Daniel Gehrig*, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, Davide Scaramuzza, TimeLens: Event-based Video Frame Interpolation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, 2021, http://rpg.ifi.uzh.ch/docs/CVPR21_Gehrig.pdf

[9] a) CLIPDraw: exploring text-to-drawing synthesis through language-image encoders
b) StyleCLIPDraw: Schaldenbrand, P., Liu, Z. and Oh, J., 2021. StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Synthesis.

[10] Xiangli, Y., Xu, L., Pan, X., Zhao, N., Rao, A., Theobalt, C., Dai, B. and Lin, D., 2021. CityNeRF: Building NeRF at City Scale.

Owner
Louis-François Bouchard
Master student, AI Research Scientist, and speaker on YouTube.
Louis-François Bouchard
Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Oh-My-Face This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, e

AiLin Huang 51 Nov 17, 2022
Few-Shot Object Detection via Association and DIscrimination

Few-Shot Object Detection via Association and DIscrimination Code release of our NeurIPS 2021 paper: Few-Shot Object Detection via Association and DIs

Cao Yuhang 49 Dec 18, 2022
PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation

PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation The paper: https://arxiv.org/abs/1704.03296 What makes

Jacob Gildenblat 322 Dec 17, 2022
It helps user to learn Pick-up lines and share if he has a better one

Pick-up-Lines-Generator(Open Source) It helps user to learn Pick-up lines Share and Add one or many to the DataBase Unique SQLite DataBase AI Undercon

knock_nott 0 May 04, 2022
RTSeg: Real-time Semantic Segmentation Comparative Study

Real-time Semantic Segmentation Comparative Study The repository contains the official TensorFlow code used in our papers: RTSEG: REAL-TIME SEMANTIC S

Mennatullah Siam 592 Nov 18, 2022
Deep Sea Treasure Environment for Multi-Objective Optimization Research

DeepSeaTreasure Environment Installation In order to get started with this environment, you can install it using the following command: python3 -m pip

imec IDLab 6 Nov 14, 2022
Road Crack Detection Using Deep Learning Methods

Road-Crack-Detection-Using-Deep-Learning-Methods This is my Diploma Thesis ¨Road Crack Detection Using Deep Learning Methods¨ under the supervision of

Aggelos Katsaliros 3 May 03, 2022
Repo for our ICML21 paper Unsupervised Learning of Visual 3D Keypoints for Control

Unsupervised Learning of Visual 3D Keypoints for Control [Project Website] [Paper] Boyuan Chen1, Pieter Abbeel1, Deepak Pathak2 1UC Berkeley 2Carnegie

Boyuan Chen 34 Jul 22, 2022
A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.

Xcessiv Xcessiv is a tool to help you create the biggest, craziest, and most excessive stacked ensembles you can think of. Stacked ensembles are simpl

Reiichiro Nakano 1.3k Nov 17, 2022
HybridNets: End-to-End Perception Network

HybridNets: End2End Perception Network HybridNets Network Architecture. HybridNets: End-to-End Perception Network by Dat Vu, Bao Ngo, Hung Phan đź“§ FPT

Thanh Dat Vu 370 Dec 29, 2022
Unofficial implementation of PatchCore anomaly detection

PatchCore anomaly detection Unofficial implementation of PatchCore(new SOTA) anomaly detection model Original Paper : Towards Total Recall in Industri

Changwoo Ha 268 Dec 22, 2022
Official PyTorch implementation of UACANet: Uncertainty Aware Context Attention for Polyp Segmentation

UACANet: Uncertainty Aware Context Attention for Polyp Segmentation Official pytorch implementation of UACANet: Uncertainty Aware Context Attention fo

Taehun Kim 85 Dec 14, 2022
Code and Datasets from the paper "Self-supervised contrastive learning for volcanic unrest detection from InSAR data"

Code and Datasets from the paper "Self-supervised contrastive learning for volcanic unrest detection from InSAR data" You can download the pretrained

Bountos Nikos 3 May 07, 2022
Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow

Do you want a RL agent nicely moving on Atari? Rainbow is all you need! This is a step-by-step tutorial from DQN to Rainbow. Every chapter contains bo

Jinwoo Park (Curt) 1.4k Dec 29, 2022
Code of paper "CDFI: Compression-Driven Network Design for Frame Interpolation", CVPR 2021

CDFI (Compression-Driven-Frame-Interpolation) [Paper] (Coming soon...) | [arXiv] Tianyu Ding*, Luming Liang*, Zhihui Zhu, Ilya Zharkov IEEE Conference

Tianyu Ding 95 Dec 04, 2022
Dynamic Token Normalization Improves Vision Transformers

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

Wenqi Shao 20 Oct 09, 2022
A very impractical 3D rendering engine that runs in the python terminal.

Terminal-3D-Render A very impractical 3D rendering engine that runs in the python terminal. do NOT try to run this program using the standard python I

23 Dec 31, 2022
Label-Free Model Evaluation with Semi-Structured Dataset Representations

Label-Free Model Evaluation with Semi-Structured Dataset Representations Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch

8 Oct 06, 2022
Global-Local Attention for Emotion Recognition

Global-Local Attention for Emotion Recognition Requirements Python 3 Install tensorflow (or tensorflow-gpu) = 2.0.0 Install some other packages pip i

Minh Nhat Le 15 Apr 21, 2022
Code for NeurIPS2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints"

This repository is the code for NeurIPS 2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints". Edit 2021/

10 Dec 20, 2022