A curated list of papers, code and resources pertaining to image composition

Awesome Image Composition

A curated list of resources including papers, datasets, and relevant links pertaining to image composition.

Contributing

Contributions are welcome. If you wish to contribute, feel free to send a pull request. If you have suggestions for new sections to be included, please raise an issue and discuss before sending a pull request.

Surveys

  • Li Niu, Wenyan Cong, Liu Liu, Yan Hong, Bo Zhang, Jing Liang, Liqing Zhang: "Making Images Real Again: A Comprehensive Survey on Deep Image Composition." arXiv preprint arXiv:2106.14490 (2021). [arXiv]

Papers

Image blending

  • Huikai Wu, Shuai Zheng, Junge Zhang, Kaiqi Huang: "GP-GAN: Towards Realistic High-Resolution Image Blending." ACM MM (2019) [arXiv] [code]
  • Lingzhi Zhang, Tarmily Wen, Jianbo Shi: "Deep Image Blending." WACV (2020) [pdf] [arXiv] [code]

Image harmonization

  • Jun Ling, Han Xue, Li Song, Rong Xie, Xiao Gu: "Region-Aware Adaptive Instance Normalization for Image Harmonization." CVPR (2021) [pdf] [supp] [arXiv] [code].
  • Zonghui Guo, Haiyong Zheng, Yufeng Jiang, Zhaorui Gu, Bing Zheng: "Intrinsic Image Harmonization." CVPR (2021) [pdf] [supp] [code].
  • Wenyan Cong, Li Niu, Jianfu Zhang, Jing Liang, Liqing Zhang: "BargainNet: Background-Guided Domain Translation for Image Harmonization." ICME (2021) [arXiv] [code].
  • Konstantin Sofiiuk, Polina Popenova, Anton Konushin: "Foreground-aware Semantic Representations for Image Harmonization." WACV (2021) [pdf] [supp] [arXiv] [code]
  • Guoqing Hao, Satoshi Iizuka, Kazuhiro Fukui: "Image Harmonization with Attention-based Deep Feature Modulation." BMVC (2020) [pdf] [supp] [code]
  • Wenyan Cong, Jianfu Zhang, Li Niu, Liu Liu, Zhixin Ling, Weiyuan Li, Liqing Zhang: "DoveNet: Deep Image Harmonization via Domain Verification." CVPR (2020) [pdf] [supp] [arXiv] [code].
  • Xiaodong Cun, Chi-Man Pun: "Improving the Harmony of the Composite Image by Spatial-Separated Attention Module." IEEE Trans. Image Process. 29: 4759-4771 (2020) [pdf] [arXiv] [code]
  • Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang: "Deep Image Harmonization." CVPR (2017) [pdf] [supp] [arXiv] [code]

Shadow generation

  • Daquan Liu, Chengjiang Long, Hongpan Zhang, Hanning Yu, Xinzhi Dong, Chunxia Xiao: "ARShadowGAN: Shadow Generative Adversarial Network for Augmented Reality in Single Light Scenes." CVPR (2020) [pdf] [code].

  • Shuyang Zhang, Runze Liang, Miao Wang: "ShadowGAN: Shadow Synthesis for Virtual Objects with Conditional Adversarial Networks." Computational Visual Media (2019) [pdf].

  • Fangneng Zhan, Shijian Lu, Changgong Zhang, Feiying Ma, Xuansong Xie: "Adversarial Image Composition with Auxiliary Illumination." ACCV (2020) [pdf].

Object placement and spatial transformation

  • Lingzhi Zhang, Tarmily Wen, Jie Min, Jiancong Wang, David Han, Jianbo Shi: "Learning Object Placement by Inpainting for Compositional Data Augmentation" ECCV (2020) [pdf]

  • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell: "Compositional GAN: Learning Image-Conditional Binary Composition" International Journal of Computer Vision (2020) [arXiv] [code]

  • Song-Hai Zhang, Zhengping Zhou, Bin Liu, Xi Dong, Peter Hall: "What and Where: A Context-based Recommendation System for Object Insertion" Computational Visual Media (2020) [arXiv]

  • Shashank Tripathi, Siddhartha Chandra, Amit Agrawal, Ambrish Tyagi, James M. Rehg, Visesh Chari: "Learning to Generate Synthetic Data via Compositing" CVPR (2019) [arXiv]

  • Haoshu Fang, Jianhua Sun, Runzhong Wang, Minghao Gou, Yonglu Li, Cewu Lu: "InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting" ICCV (2019) [arXiv] [code]

  • Chen-Hsuan Lin, Ersin Yumer, Oliver Wang, Eli Shechtman, Simon Lucey: "ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing" CVPR (2018) [arXiv] [code]

  • Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz: "Context-Aware Synthesis and Placement of Object Instances" NeurIPS (2018) [arXiv] [code]

  • Fuwen Tan, Crispin Bernier, Benjamin Cohen, Vicente Ordonez, Connelly Barnes: "Where and Who? Automatic Semantic-Aware Person Composition" WACV (2018) [arXiv] [code]

  • Tal Remez, Jonathan Huang, Matthew Brown: "Learning to Segment via Cut-and-Paste" ECCV (2018) [arXiv] [code]

Occlusion

  • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell: "Compositional GAN: Learning Image-Conditional Binary Composition." IJCV (2020) [arXiv] [code]
  • Fangneng Zhan, Jiaxing Huang, Shijian Lu: "Hierarchy Composition GAN for High-fidelity Image Synthesis." IEEE Transactions on Cybernetics (2021) [arXiv]

Datasets

  • iHarmony4 (image harmonization): It contains four subdatasets: HCOCO, HAdobe5k, HFlickr, Hday2night, with a total of 73,146 pairs of unharmonized images and harmonized images. [pdf] [link]
  • GMSDataset (image harmonization): It contains 183 images with a resolution of 1940×1440. It consists of 16 different objects; for each object, one source image and 11 target images are captured in different background scenes and illumination conditions. [pdf] [link] (access code: ekn2)
  • HVIDIT (image harmonization): A dataset for image harmonization built upon the VIDIT (Virtual Image Dataset for Illumination Transfer) dataset. It contains 3,007 images of 276 scenes for training and 329 images of 24 scenes for testing. [pdf] [link]
  • RHHarmony (image harmonization): A rendered image harmonization dataset, which contains 15,000 ground-truth rendered images and has the potential to generate 135,000 composite rendered images. [pdf] [link]
  • Shadow-AR (shadow generation): It contains 3,000 quintuples. Each quintuple consists of 5 images at 640×480 resolution: a synthetic image without the virtual object shadow and its corresponding image containing the virtual object shadow, a mask of the virtual object, a labeled real-world shadow matting, and its corresponding labeled occluder. [pdf] [link]
  • DESOBA (shadow generation): It contains 840 training images with 2,999 object-shadow pairs in total and 160 test images with 624 object-shadow pairs in total. [pdf] [link]
  • OPA (object placement): It contains 62,074 training images and 11,396 test images, in which the foregrounds/backgrounds in the training set and test set have no overlap. The training (resp., test) set contains 21,351 (resp., 3,566) positive samples and 40,724 (resp., 7,830) negative samples. [pdf] [link]

Other resources

Owner
BCMI
Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University.