[ICCV2021] IICNet: A Generic Framework for Reversible Image Conversion

IICNet - Invertible Image Conversion Net

Official PyTorch Implementation for IICNet: A Generic Framework for Reversible Image Conversion (ICCV2021). Demo Video | Supplements

Introduction

Reversible image conversion (RIC) aims to build a reversible transformation between specific visual content (e.g., short videos) and an embedding image, where the original content can be restored from the embedding when necessary. This work develops Invertible Image Conversion Net (IICNet) as a generic solution to various RIC tasks, owing to its strong capacity and task-independent design. Unlike previous encoder-decoder based methods, IICNet maintains a highly invertible structure based on invertible neural networks (INNs) to better preserve information during conversion. We use a relation module to improve the INN's nonlinearity for extracting cross-image relations, and a channel squeeze layer to improve the network's flexibility. Experimental results demonstrate that IICNet outperforms task-specific methods on existing RIC tasks and generalizes well to various newly explored tasks. With our generic IICNet, we no longer need to hand-engineer task-specific embedding networks for rapidly emerging visual content.
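
To make the idea of an invertible structure concrete, here is a minimal sketch of an additive coupling step, a common building block of INNs. It is an illustration only and is not IICNet's actual layer: the function names and the toy transform `shift` are assumptions made for this example, and plain Python lists stand in for image tensors.

```python
import math


def shift(values):
    """Toy nonlinear transform (hypothetical); it need not be invertible itself."""
    return [2.0 * math.tanh(v) for v in values]


def coupling_forward(x):
    """Keep the first half of x and shift the second half by a function of the first."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    y2 = [b + s for b, s in zip(x2, shift(x1))]
    return x1 + y2


def coupling_inverse(y):
    """Undo coupling_forward by subtracting the same shift from the second half."""
    if len(y) % 2:
        raise ValueError("input length must be even")
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    x2 = [b - s for b, s in zip(y2, shift(y1))]
    return y1 + x2


x = [0.5, -1.0, 2.0, 3.0]
restored = coupling_inverse(coupling_forward(x))
# Compare with a tolerance because floating-point add/subtract may not round-trip exactly.
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(x, restored)))
```

Because the first half passes through unchanged, the inverse can recompute the same shift and subtract it, so no information is lost regardless of how complex `shift` is.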

Installation

Clone this repository and set up the environment.

git clone https://github.com/felixcheng97/IICNet.git
cd IICNet/
conda env create -f iic.yml

Dataset Preparation

We conduct experiments on 5 multiple-and-single RIC tasks in the main paper and 2 single-and-single RIC tasks in the supplements. All datasets are placed under the ./datasets directory.

Task 1: Spatial-Temporal Video Embedding

We use the high-quality DAVIS 2017 video dataset in this task. You can download the Semi-supervised 480p dataset through this link. Unzip the archives, rename the extracted folders, and place them under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |   |-- DAVIS-2017-test-challenge (rename the DAVIS folder from DAVIS-2017-test-challenge-480p.zip)
    |   |-- DAVIS-2017-test-dev       (rename the DAVIS folder from DAVIS-2017-test-dev-480p.zip)
    |   `-- DAVIS-2017-trainval       (rename the DAVIS folder from DAVIS-2017-trainval-480p.zip)
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python davis_annotation.py
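
Since the annotation script depends on the renamed folders, it can help to verify the layout first. The following is a minimal sketch, not part of the repository: the helper `missing_folders` is hypothetical, the folder names are copied from the tree above, and the relative path assumes the script is run from the repository root.

```python
from pathlib import Path

# Folder names copied from the directory tree above.
DAVIS_FOLDERS = [
    "DAVIS-2017-test-challenge",
    "DAVIS-2017-test-dev",
    "DAVIS-2017-trainval",
]


def missing_folders(root, names):
    """Return the names that are not directories under root."""
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    missing = missing_folders("datasets/DAVIS-2017", DAVIS_FOLDERS)
    if missing:
        print("Missing DAVIS folders:", ", ".join(missing))
    else:
        print("DAVIS-2017 layout looks complete.")
```

The same helper can be reused for the other tasks by passing the folder names from their respective trees.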

Task 2: Mononizing Binocular Images

We use the Flickr1024 dataset with the official train and test splits in this task. You can download the dataset through this link. Place the dataset under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |   |-- Test
    |   |-- Train_1
    |   |-- Train_2
    |   |-- Train_3
    |   |-- Train_4
    |   `-- Validation
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python flicker1024_annotation.py

Task 3: Embedding Dual-View Images

We use the DIV2K dataset in this task. You can download the dataset through this link. Download the corresponding datasets and place them under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |   |-- DIV2K_train_HR
    |   |-- DIV2K_train_LR_bicubic
    |   |   |-- X2
    |   |   |-- X4
    |   |   |-- X8
    |   |-- DIV2K_valid_HR
    |   `-- DIV2K_valid_LR_bicubic
    |       |-- X2
    |       |-- X4
    |       `-- X8
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python div2kddual_annotation.py

Task 4: Embedding Multi-Layer Images / Composition and Decomposition

We use the Adobe Deep Matting dataset and the Real Matting dataset in this task. You can download the Adobe Deep Matting dataset according to their instructions through this link. You can download the Real Matting dataset on its official GitHub page or through this direct link. Place the downloaded datasets under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |   |-- Addobe_Deep_Matting_Dataset.zip
    |   |-- train2014.zip
    |   |-- VOC2008test.tar
    |   `-- VOCtrainval_14-Jul-2008.tar
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    |   |-- fixed-camera
    |   `-- hand-held
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts

# process the Adobe Matting dataset
python adobe_process.py
python adobe_annotation.py

# process the Real Matting dataset
python real_process.py
python real_annotation.py
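
The four commands above must run in order, since each annotation step follows its processing step. As a rough sketch under that assumption, they could be chained from Python so that a failure stops the sequence early. The helpers `build_commands` and `run_in_order` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Script order copied from the commands above: process first, then annotate.
TASK4_SCRIPTS = [
    "adobe_process.py",
    "adobe_annotation.py",
    "real_process.py",
    "real_annotation.py",
]


def build_commands(scripts, python=sys.executable):
    """Turn script names into argv lists for subprocess."""
    return [[python, script] for script in scripts]


def run_in_order(scripts, cwd="codes/scripts"):
    """Run each script in turn; check=True raises on the first failure."""
    for command in build_commands(scripts):
        subprocess.run(command, cwd=cwd, check=True)
```

Calling `run_in_order(TASK4_SCRIPTS)` from the repository root would then mirror the shell commands, with `subprocess.CalledProcessError` raised if any script exits with a non-zero status.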

Task 5: Hiding Images in an Image

We use the Flicker 2W dataset in this task. You could download the dataset on its official GitHub page through this link. Place the unzipped dataset under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |   `-- flicker_2W_images
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python flicker_annotation.py

Task 6 (supp): Invertible Grayscale

We use the VOC2012 dataset in this task. You could download the training/validation dataset through this link. Place the unzipped dataset under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit
        `-- VOC2012

Then run the following script for annotation.

cd codes/scripts
python voc2012_annotation.py

Task 7 (supp): Invertible Image Rescaling

We use the DIV2K dataset in this task. Please check Task 3: Embedding Dual-View Images to download the corresponding dataset. Then run the following scripts for annotation.

cd codes/scripts
python div2ksr_annotation.py

Training

To train a model for a specific task, run the following script:

cd codes
OMP_NUM_THREADS=4 python train.py -opt ./conf/train/<xxx>.yml
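
The `<xxx>` placeholder stands for a task-specific configuration file. A small sketch for listing the available choices follows; it assumes the configuration files sit directly inside `conf/train`, as the command path suggests, and the helper `list_configs` is hypothetical rather than part of the repository.

```python
from pathlib import Path


def list_configs(conf_dir):
    """Return the sorted names of the .yml files directly inside conf_dir."""
    return sorted(path.name for path in Path(conf_dir).glob("*.yml"))


if __name__ == "__main__":
    # Assumes the current directory is codes/, as in the command above.
    for name in list_configs("conf/train"):
        print(name)
```

Pointing the helper at `conf/test` instead would list the testing configurations in the same way.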

To enable distributed training with multiple GPUs for a specific task, assign a list of gpu_ids in the yml file and run the following script. Note that multi-GPU training has not been tested yet, so we suggest training with a single GPU.

cd codes
OMP_NUM_THREADS=4 python -m torch.distributed.launch --nproc_per_node=4 --master_port 29501 train.py -opt ./conf/train/<xxx>.yml
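
The command above hard-codes four processes. The sketch below derives `--nproc_per_node` from the GPU list instead, assuming one process per GPU listed in `gpu_ids`, which is the usual convention for this launcher but is not confirmed by the repository. The helper `build_launch_command` is hypothetical, and only flags from the command above are used.

```python
def build_launch_command(config, gpu_ids, master_port=29501):
    """Build the launcher argv with one process per listed GPU (assumed convention)."""
    if not gpu_ids:
        raise ValueError("gpu_ids must contain at least one GPU index")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",
        "--master_port", str(master_port),
        "train.py", "-opt", config,
    ]


# The config path keeps the placeholder from the text; substitute a real file name.
print(" ".join(build_launch_command("./conf/train/<xxx>.yml", [0, 1, 2, 3])))
```

With four GPU indices this reproduces the launch arguments shown above; the `OMP_NUM_THREADS=4` prefix would still be set in the shell environment.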

Testing

We provide the trained models from our paper for your reference. Download the pretrained weights from Google Drive or Baidu Drive (extraction code: e377), unzip the archive, and place the pretrained models under the ./experiments directory.

To test a model for a specific task, run the following script:

cd codes
OMP_NUM_THREADS=4 python test.py -opt ./conf/test/<xxx>.yml

Acknowledgement

Some code in this repository builds on Invertible Image Rescaling (IRN).

Citation

If you find this work useful, please cite our paper:

@inproceedings{cheng2021iicnet,
    title = {IICNet: A Generic Framework for Reversible Image Conversion}, 
    author = {Ka Leong Cheng and Yueqi Xie and Qifeng Chen},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
    year = {2021}
}

Contact

Feel free to open an issue if you have any questions. You can also contact us directly by email at [email protected] (Ka Leong Cheng) and [email protected] (Yueqi Xie).
