An architecture that makes any doodle realistic, in any specified style, using VQGAN, CLIP and some basic embedding arithmetic.

Last update: Dec 18, 2022

Sketch Simulator

See the final cell output of the colab below for some examples with and without subtracting sketch embedding averages.

WARNING: This colab is messy, a precursor of the code in this repo, but it works.

Architecture Overview

Setup

Run ./setup.sh in your environment. This installs the required libraries and downloads the model weights.

Usage

To process a single doodle in your desired style (see train.py for all available modifiers), run:
- train.py --start_image "path/to/your/doodle" --prompts "a painting in the style of ... | Trending on artstation"
Prompts are split on "|", and specific weights can be assigned using {prompt1}:{weight1}|{prompt2}:{weight2}.
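The prompt syntax above can be illustrated with a small parser. This is a minimal sketch, not the code in train.py: the function name parse_prompts and the default weight of 1.0 for prompts without an explicit weight are assumptions made for illustration.

```python
def parse_prompts(spec, default_weight=1.0):
    """Split a prompt string on "|" and read an optional ":weight" suffix per prompt.

    Hypothetical helper: default_weight is an assumption, not taken from the repo.
    """
    parsed = []
    for chunk in spec.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip empty segments such as a trailing "|"
        text, sep, tail = chunk.rpartition(":")
        weight = default_weight
        if sep:
            try:
                weight = float(tail)
            except ValueError:
                text = chunk  # the colon was part of the prompt text, not a weight
        else:
            text = chunk  # no colon at all, so the whole chunk is the prompt
        parsed.append((text.strip(), weight))
    return parsed


print(parse_prompts("a watercolor painting | Trending on artstation:0.5"))
```

Taking the weight from the last ":" keeps prompts that contain a colon themselves usable, as long as the text after that colon is not a number.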
To explore the hyperparameter space, or large numbers of doodles and/or prompts, using Weights & Biases:
- Create a sweep config with your desired parameters, your_sweep.yaml, in sweep_configs/ (see sweep_configs/* for examples)
- Start the sweep:
  - wandb sweep -p Sketch-sim "path/to/your_sweep.yaml" (this returns the sweep_ID, to be used in the next command)
  - wandb agent janzuiderveld/Sketch-sim/sweep_ID
- Alternatively, when working in SLURM environments, you can use SLURM_scripts/sweeper.sh (make sure to edit the paths appropriately):
  - sbatch SLURM_scripts/sweeper.sh "path/to/your_sweep.yaml"
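To get a feel for what a grid sweep over doodles and prompts covers, the sketch below mirrors the general layout of a W&B sweep config (program, method, parameters with values lists) as a Python dict and enumerates the resulting runs. The image paths and prompt strings are hypothetical placeholders, expand_grid is an illustrative helper, and the files in sweep_configs/ remain the authoritative examples for this repo.

```python
from itertools import product

# Same structure a sweep YAML would have; all values are hypothetical placeholders.
sweep_config = {
    "program": "train.py",
    "method": "grid",
    "parameters": {
        "start_image": {"values": ["doodles/cat.png", "doodles/house.png"]},
        "prompts": {
            "values": [
                "a watercolor painting",
                "a watercolor painting|Trending on artstation:0.5",
            ]
        },
    },
}


def expand_grid(config):
    """List every parameter combination a grid sweep would try."""
    names = list(config["parameters"])
    value_lists = [config["parameters"][name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


runs = expand_grid(sweep_config)
for run in runs:
    print(run)
```

The number of runs is the product of the lengths of all values lists, so it is worth checking this count before starting agents on a large sweep.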

All outputs are saved in outputs/{args.experiment_name}/step_{i}.png
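Because the step index in step_{i}.png is not zero-padded, a plain alphabetical sort places step_10.png before step_2.png. The following sketch, which assumes only the naming pattern stated above, orders the files by their numeric step; sorted_steps and latest_step are hypothetical helper names.

```python
import re
from pathlib import Path

STEP_PATTERN = re.compile(r"step_(\d+)\.png$")


def sorted_steps(filenames):
    """Return the step_{i}.png names ordered by i, ignoring anything else."""
    indexed = []
    for name in filenames:
        match = STEP_PATTERN.search(str(name))
        if match:
            indexed.append((int(match.group(1)), name))
    return [name for _, name in sorted(indexed)]


def latest_step(experiment_dir):
    """Return the file name of the highest-numbered step image, or None if there is none."""
    steps = sorted_steps(p.name for p in Path(experiment_dir).glob("step_*.png"))
    return steps[-1] if steps else None


print(sorted_steps(["step_10.png", "step_2.png", "notes.txt", "step_1.png"]))
```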

Calculate Average Sketch Embedding

To (re)calculate the average sketch embeddings, run the command below (the provided results/ovl_mean_sketch.pth was calculated from 1000 padded items per class for all 350 Quickdraw classes):
- extract_sketch_emb.py --items_per_class 1000 --save_root "path/to/repo/root" --pad_images 6
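The idea behind the average sketch embedding can be sketched with toy vectors: average the embeddings of many sketches, then subtract that average from a doodle's embedding so that what remains leans towards the doodle's content rather than its "sketchiness". This is an illustration under assumptions, not the repo's implementation: real CLIP embeddings are high-dimensional tensors, and whether the repo renormalizes or scales the result is not stated here. All function names are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_embedding(embeddings):
    """Element-wise average over a list of equally sized vectors."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def remove_sketch_component(embedding, mean_sketch):
    """Subtract the average sketch embedding and renormalize (assumed arithmetic)."""
    return normalize([e - m for e, m in zip(embedding, mean_sketch)])


# Toy 3-dimensional stand-ins for sketch embeddings; axis 0 plays the role of "sketchiness".
sketch_embeddings = [
    normalize(v) for v in ([1.0, 0.2, 0.0], [0.9, 0.0, 0.3], [1.0, 0.1, 0.1])
]
mean_sketch = mean_embedding(sketch_embeddings)

doodle = normalize([0.8, 0.6, 0.1])
content_only = remove_sketch_component(doodle, mean_sketch)
print(content_only)
```

In this toy setup the shared first axis is suppressed after subtraction, while the second axis, which distinguishes the doodle from the average sketch, becomes dominant.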

Notes

One step of synthesizing and embedding 400x400 images takes about 0.3 seconds on a single 1080 GPU. 20-30 steps are usually enough for good results, which comes to roughly 6-9 seconds per doodle, excluding startup time.
Prompts can be used as metrics in large hyperparameter sweeps by giving them a weight of 0: their scores are logged automatically, but they do not influence the result.
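Why a weight of 0 turns a prompt into a pure metric can be seen in a weighted loss: every prompt's score is computed and logged, but a zero weight removes its contribution to the value being optimized. The sketch below uses 1 minus cosine similarity as a stand-in distance, which is an assumption; the loss actually used in train.py may differ, and score_prompts is a hypothetical name.

```python
def score_prompts(similarities, weights):
    """Return (loss, log): the weighted loss and the per-prompt scores to log.

    similarities maps prompt text to an image/prompt similarity (higher is closer).
    The distance 1 - similarity is a stand-in, not necessarily the repo's loss.
    """
    loss = 0.0
    log = {}
    for prompt, similarity in similarities.items():
        distance = 1.0 - similarity
        log[prompt] = distance  # logged for every prompt, regardless of weight
        loss += weights[prompt] * distance  # weight 0 contributes nothing
    return loss, log


similarities = {"a watercolor painting": 0.4, "a sketch": 0.1}
weights = {"a watercolor painting": 1.0, "a sketch": 0.0}

loss, log = score_prompts(similarities, weights)
print(loss, log)
```

Here "a sketch" acts purely as a monitor: its score appears in the log, so runs in a sweep can be compared on it, while only "a watercolor painting" steers the optimization.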

TODO

Add server / client scripts to circumvent startup times
Add CLIP-based classifier for testing conceptual embedding accuracy on Quickdraw classification
