Super Resolution for images using deep learning.

Last update: Dec 29, 2022

Neural Enhance

Example #1 — Old Station: view comparison in 24-bit HD, original photo CC-BY-SA @siv-athens.

As seen on TV! What if you could increase the resolution of your photos using technology from CSI laboratories? Thanks to deep learning and #NeuralEnhance, it's now possible to train a neural network to zoom in to your images at 2x or even 4x. You'll get even better results by increasing the number of neurons or training with a dataset similar to your low resolution image.

The catch? The neural network is hallucinating details based on its training from example images. It's not reconstructing your photo exactly as it would have been if it was HD. That's only possible in Hollywood — but using deep learning as "Creative AI" works and it is just as cool! Here's how you can get started...

Examples & Usage
Installation
Background & Research
Troubleshooting
Frequent Questions

1. Examples & Usage

The main script is called enhance.py, which you can run with Python 3.4+ once it's set up as described below. The --device argument lets you specify which GPU or CPU to use. For the samples above, here are the performance results:

GPU Rendering HQ — Assuming you have CUDA set up and enough on-board RAM to fit the image and neural network, generating 1080p output should complete in 5 seconds, or 2s per image when processing multiple images at the same time.
CPU Rendering HQ — This will take roughly 20 to 60 seconds for 1080p output; however, on most machines you can run 4-8 processes simultaneously given enough system RAM. Runtime depends on the neural network size.

The default is --device=cpu. If you have an NVIDIA card already set up with CUDA, try --device=gpu0. On the CPU, you can also set the environment variable OMP_NUM_THREADS=4, which is most useful when running the script multiple times in parallel.
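
As a rough sketch of the parallel CPU setup described above, the Python helper below builds one enhance.py command per image with OMP_NUM_THREADS set, then runs a few of them at a time. The names build_job and run_parallel, and the default worker count, are illustrative assumptions and not part of the project.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_job(image, zoom=2, threads=4):
    """Return (command, environment) for one enhance.py process on the CPU."""
    cmd = [sys.executable, "enhance.py", "--device=cpu", "--zoom=%d" % zoom, image]
    # Copy the current environment and cap the OpenMP threads for this process.
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    return cmd, env


def run_parallel(images, workers=4, **job_options):
    """Run one enhance.py process per image, at most `workers` at a time."""
    def run(image):
        cmd, env = build_job(image, **job_options)
        return subprocess.run(cmd, env=env).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, images))


# Hypothetical usage, from the folder that contains enhance.py:
#   exit_codes = run_parallel(["file1.jpg", "file2.jpg"], workers=4, threads=4)
```

Threads are enough here because each worker only waits on an external process; the heavy lifting happens inside the separate enhance.py processes.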

1.a) Enhancing Images

A list of example command lines you can use with the pre-trained models provided in the GitHub releases:

# Run the super-resolution script to repair JPEG artefacts, zoom factor 1:1.
python3 enhance.py --type=photo --model=repair --zoom=1 broken.jpg

# Process multiple good quality images with a single run, zoom factor 2:1.
python3 enhance.py --type=photo --zoom=2 file1.jpg file2.jpg

# Display output images that were given `_ne?x.png` suffix.
open *_ne?x.png
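
If you want to locate the results from a script rather than the shell, something like the following may help. It assumes the output is written next to the input with the extension replaced by the `_ne?x.png` suffix, where `?` is the zoom factor; the helper names are made up for this example.

```python
import glob
import os


def expected_output(path, zoom):
    """Guess the output file name for an input image and zoom factor."""
    stem, _extension = os.path.splitext(path)
    return "%s_ne%dx.png" % (stem, zoom)


def find_outputs(folder="."):
    """List files in `folder` that carry the `_ne?x.png` suffix."""
    return sorted(glob.glob(os.path.join(folder, "*_ne?x.png")))


# Hypothetical usage:
#   expected_output("images/broken.jpg", 1) gives "images/broken_ne1x.png"
```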

Here's a list of currently supported models, image types, and zoom levels in one table.

FEATURES	`--model=default`	`--model=repair`	`--model=denoise`	`--model=deblur`
`--type=photo`	2x	1x	…	…

1.b) Training Super-Resolution

Pre-trained models are provided in the GitHub releases. Training your own is a delicate process that may require you to pick parameters based on your image dataset.

# Remove the model file, as we don't want to reload the data to fine-tune it.
rm -f ne?x*.pkl.bz2

# Pre-train the model using perceptual loss from paper [1] below.
python3.4 enhance.py --train "data/*.jpg" --model custom --scales=2 --epochs=50 \
    --perceptual-layer=conv2_2 --smoothness-weight=1e7 --adversary-weight=0.0 \
    --generator-blocks=4 --generator-filters=64

# Train the model using an adversarial setup based on [4] below.
python3.4 enhance.py --train "data/*.jpg" --model custom --scales=2 --epochs=250 \
         --perceptual-layer=conv5_2 --smoothness-weight=2e4 --adversary-weight=1e3 \
         --generator-start=5 --discriminator-start=0 --adversarial-start=5 \
         --discriminator-size=64

# The newly trained model is output into this file...
ls ne?x-custom-*.pkl.bz2

Example #2 — Bank Lobby: view comparison in 24-bit HD, original photo CC-BY-SA @benarent.

2. Installation & Setup

2.a) Using Docker Image [recommended]

The easiest way to get up-and-running is to install Docker. Then, you should be able to download and run the pre-built image using the docker command line tool. Find out more about the alexjc/neural-enhance image on its Docker Hub page.

Here's the simplest way to call the script using docker. If you're familiar with the -v argument for mounting folders, you can use it directly to specify which files to enhance:

# Download the Docker image and show the help text to make sure it works.
docker run --rm -v `pwd`:/ne/input -it alexjc/neural-enhance --help

Single Image — In practice, we suggest you set up an alias called enhance to automatically expose the folder containing your specified image, so the script can read it and store results where you can access them. This is how you can do it in your terminal console on OSX or Linux:

# Set up the alias. Put this in your .bashrc or .zshrc file so it's available at startup.
alias enhance='function ne() { docker run --rm -v "$(pwd)/`dirname ${@:$#}`":/ne/input -it alexjc/neural-enhance ${@:1:$#-1} "input/`basename ${@:$#}`"; }; ne'

# Now run any of the examples above using this alias, without the `.py` extension.
enhance --zoom=1 --model=repair images/broken.jpg
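
The alias is dense, so here is a sketch in Python of what it appears to do: take the last argument as the image path, mount its folder at /ne/input, and pass only the file name to the container. The function name docker_command is an assumption made for illustration, and the sketch only builds the argument list without running docker.

```python
import os
import posixpath


def docker_command(args, cwd=None, image="alexjc/neural-enhance"):
    """Mirror the alias: mount the last argument's folder, pass its basename."""
    *options, target = args          # everything but the last argument is an option
    cwd = os.getcwd() if cwd is None else cwd
    # Folder that holds the image, made absolute relative to the working directory.
    folder = posixpath.normpath(posixpath.join(cwd, posixpath.dirname(target)))
    return (["docker", "run", "--rm", "-v", folder + ":/ne/input", "-it", image]
            + options + ["input/" + posixpath.basename(target)])


# Hypothetical usage:
#   import subprocess
#   subprocess.run(docker_command(["--zoom=1", "--model=repair", "images/broken.jpg"]))
```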

Multiple Images — To enhance multiple images in a row (faster) from a folder or wildcard specification, make sure to quote the argument to the alias command:

# Process multiple images, make sure to quote the argument!
enhance --zoom=2 "images/*.jpg"

If you want to run on your NVIDIA GPU, you can instead change the alias to use the image alexjc/neural-enhance:gpu which comes with CUDA and CUDNN pre-installed. Then run it within nvidia-docker and it should use your physical hardware!

2.b) Manual Installation [developers]

This project requires Python 3.4+ and you'll also need numpy and scipy (numerical computing libraries) as well as python3-dev installed system-wide. If you want more detailed instructions, follow these:

Linux Installation of Lasagne (intermediate)
Mac OSX Installation of Lasagne (advanced)
Windows Installation of Lasagne (expert)

After fetching the repository, you can run the following commands from your terminal to set up a local environment:

# Create a local environment for Python 3.x to install dependencies here.
python3 -m venv pyvenv --system-site-packages

# If you're using bash, make this the active version of Python.
source pyvenv/bin/activate

# Set up the required dependencies simply using the PIP module.
python3 -m pip install --ignore-installed -r requirements.txt

After this, you should have pillow, theano and lasagne installed in your virtual environment. You'll also need to download this pre-trained neural network (VGG19, 80 MB) and put it in the same folder as the script to run. To uninstall everything, you can just delete the #/pyvenv/ folder.
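
To confirm the environment is in the state described above, a quick check along these lines can be run inside the activated virtual environment. It is only a sketch: the function name is hypothetical, and it tests whether the modules can be located, not whether their versions match requirements.txt.

```python
import importlib.util

# Package name as installed by pip, mapped to the module name used on import.
REQUIRED_MODULES = {"pillow": "PIL", "theano": "theano", "lasagne": "lasagne"}


def missing_packages(required=REQUIRED_MODULES):
    """Return the package names whose import module cannot be found."""
    return sorted(name for name, module in required.items()
                  if importlib.util.find_spec(module) is None)


# Hypothetical usage:
#   print(missing_packages() or "all dependencies found")
```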

Example #3 — Specialized super-resolution for faces, trained on HD examples of celebrity faces only. The quality is significantly higher when narrowing the domain from "photos" in general.

3. Background & Research

This code uses a combination of techniques from the following papers, as well as some minor improvements yet to be documented (watch this repository for updates):

Special thanks for their help and support in various ways:

Eder Santana — Discussions, encouragement, and his ideas on sub-pixel deconvolution.
Andrew Brock — This sub-pixel layer code is based on his project repository using Lasagne.
Casper Kaae Sønderby — For suggesting a more stable alternative to sigmoid + log as GAN loss functions.

4. Troubleshooting Problems

Can't install or Unable to find pgen, not compiling formal grammar.

There's a Python extension compiler called Cython, and it's missing or improperly installed. Try getting it directly from the system package manager rather than PIP.

FIX: sudo apt-get install cython3

NotImplementedError: AbstractConv2d theano optimization failed.

This happens when you're running without a GPU, and the CPU libraries were not found (e.g. libblas). The neural network expressions cannot be evaluated by Theano and it's raising an exception.

FIX: sudo apt-get install libblas-dev libopenblas-dev

TypeError: max_pool_2d() got an unexpected keyword argument 'mode'

You need to install Lasagne and Theano directly from the versions specified in requirements.txt, rather than from the PIP versions. These alternatives are older and don't have the required features.

FIX: python3 -m pip install -r requirements.txt

ValueError: unknown locale: UTF-8

It seems your terminal is misconfigured and not compatible with the way Python treats locales. You may need to change this in your .bashrc or other startup script. Alternatively, this command will fix it once for this shell instance.

FIX: export LC_ALL=en_US.UTF-8

Example #4 — Street View: view comparison in 24-bit HD, original photo CC-BY-SA @cyalex.

Comments

remove deprecated package from enhance.py

scipy.ndimage.imread was deprecated in 2017 with the release of scipy 1.0 and finally removed in 1.2.0 (see issue #229)

There is, however, documentation for transitioning to imageio here. The most relevant change is to use the pilmode keyword argument instead of mode.

imageio is documented here

opened by JarradTait 2
readme: Fix docker alias

Some users have file-not-found issues because the path isn't correct. This is because the shell alias evaluates $(pwd) when the alias is defined, not when the command is executed. Changing this to single quotes fixes this.

This resolves GH#28 and GH#17.

opened by purpleidea 2
Fix histogram matching with scipy 0.17.0

With my setup (scipy 0.17.0), the --rendering-histogram option distorted colors in regions where a color component attained its maximum value. This PR corrects that.

opened by AlexeyKruglov 1
image resolution check

When training on larger --batch-resolution than 300, some images in the OpenImages dataset are too small. This PR fixes those in the same way that corrupted images are ignored.

opened by graphific 1
Remove cnmem theano flag

If you're sharing your GPU with your display, using 100% of memory with lib.cnmem=1 fails with CNMEM_STATUS_OUT_OF_MEMORY. I was able to make cnmem work with a 0.8 value, but not 0.9. I think it depends on the size of your GPU mem vs the resolution of your display so there's no 'best value.'

Since the comment says if you know what you're doing you can change it, then it's probably best to have failsafe defaults for beginners, so this just doesn't use cnmem at all.

fixes #19

opened by msfeldstein 1
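
For readers who still want to experiment with cnmem, the sketch below shows one way to build a THEANO_FLAGS value with an optional memory fraction. The helper name is an assumption, the sketch only accepts fractions, and it does not reproduce how enhance.py itself sets its flags.

```python
def theano_flags(device="cpu", cnmem=None):
    """Build a THEANO_FLAGS string; `cnmem` is a fraction of GPU memory or None."""
    flags = ["floatX=float32", "device=%s" % device]
    if cnmem is not None:
        # This sketch only handles fractions of the GPU memory.
        if not 0.0 < cnmem <= 1.0:
            raise ValueError("cnmem should be a fraction in (0, 1]")
        flags.append("lib.cnmem=%g" % cnmem)
    return ",".join(flags)


# Hypothetical usage; the variable must be set before Theano is imported:
#   import os
#   os.environ.setdefault("THEANO_FLAGS", theano_flags("gpu0", cnmem=0.8))
```

Leaving cnmem unset matches the failsafe default argued for in this PR.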
Move generation of seeds out of training network

This moves the generation of the image seeds out of the training network and into the DataLoader. Currently seeds are computed as a bilinear downsampling of the original image.

This is almost functionally equivalent to the version it replaces, but opens up new possibilities at training time because the seeds are now decoupled from the network. For example, seeds could be made with different interpolations or even with other transformations such as image compression.

opened by dribnet 1
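
To make the idea of decoupled seeds concrete, here is a dependency-free sketch of producing a (seed, target) pair by halving an image. At exactly 2x with aligned sample centers, bilinear downsampling reduces to averaging each 2x2 block, which is what the code does; the function names are assumptions and the actual DataLoader may rely on a library routine instead.

```python
def downsample_2x(image):
    """Halve a grayscale image (a list of rows) by averaging each 2x2 block."""
    # Ignore a trailing odd row or column so every block is complete.
    height = len(image) // 2 * 2
    width = len(image[0]) // 2 * 2
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]


def make_pair(image, seed_function=downsample_2x):
    """Return (seed, target): the low-resolution input and the original image."""
    return seed_function(image), image
```

Because the seed function is just a parameter, a different interpolation or a compression step could be swapped in without touching the network.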
added --images-glob and added feedback on training images

Added --images-glob so that training images could be specified explicitly. Still defaults to 'dataset/*/*.jpg' as before.

Added a check that the number of training files is not zero. The script now issues an error instead of going into an infinite loop. The number of training images found is also printed to the console.

opened by dribnet 1
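
The check described in this PR could look roughly like the sketch below: expand the pattern, fail early when nothing matches, and report the count. The function name and error type are illustrative choices, not taken from the PR.

```python
import glob


def find_training_images(pattern):
    """Expand a glob pattern and fail early if it matches no files."""
    files = sorted(glob.glob(pattern))
    if not files:
        # Stop here rather than letting a data loader spin on an empty list.
        raise FileNotFoundError("No training images match %r" % pattern)
    print("Found %d training images." % len(files))
    return files
```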

Fix duplicate param definition

Caused by https://github.com/alexjc/neural-enhance/commit/203917d1227e3c5b26668aefe481cc1756bee42f - currently producing this error:

Traceback (most recent call last):
  File "enhance.py", line 40, in <module>
    add_arg('--model',              default='small', type=str,          help='Name of the neural network to load/save.')
  File "/opt/conda/lib/python3.5/argparse.py", line 1344, in add_argument
    return self._add_action(action)
  File "/opt/conda/lib/python3.5/argparse.py", line 1707, in _add_action
    self._optionals._add_action(action)
  File "/opt/conda/lib/python3.5/argparse.py", line 1548, in _add_action
    action = super(_ArgumentGroup, self)._add_action(action)
  File "/opt/conda/lib/python3.5/argparse.py", line 1358, in _add_action
    self._check_conflict(action)
  File "/opt/conda/lib/python3.5/argparse.py", line 1497, in _check_conflict
    conflict_handler(action, confl_optionals)
  File "/opt/conda/lib/python3.5/argparse.py", line 1506, in _handle_conflict_error
    raise ArgumentError(action, message % conflict_string)
argparse.ArgumentError: argument --model: conflicting option string: --model

opened by OndraM 0
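
The traceback above comes from argparse refusing to register the same option string twice. The small, self-contained sketch below reproduces that behavior with the standard library; the function name is made up for the example.

```python
import argparse


def has_duplicate_option(option_strings):
    """Return True if registering the options in order triggers a conflict."""
    parser = argparse.ArgumentParser()
    try:
        for option in option_strings:
            parser.add_argument(option, type=str)
    except argparse.ArgumentError:
        # Raised by the default conflict handler for a repeated option string.
        return True
    return False
```

Calling it with `["--model", "--zoom", "--model"]` reports a conflict, while `["--model", "--zoom"]` does not.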

Work without deprecated SciPy methods (Py3.9+)

- In the latest version of SciPy, some methods have been deprecated.
- These methods have been replaced with PIL and imageio equivalents.

--------------------More details--------------------

- The read method has been changed from "scipy.ndimage.imread(filename, mode='RGB')" to "imageio.imread(filename, as_gray=False, pilmode="RGB")".
- The return buffer has been changed from "scipy.misc.toimage(output, cmin=0, cmax=255)" to "PIL.Image.fromarray((output).astype('uint8'), mode='RGB')".

opened by xavetar 0

Releases(v0.3)

v0.3(Nov 13, 2016)
This is the third public version of the project, download it from revision [9d2aa3c] or tag v0.3.

Model files are available for download below:

Photo Zoom 2x — 8 residual blocks of 128 filters each, zoom factor 2:1.

Photo Zoom 4x — 8 residual blocks of 128 filters each, zoom factor 4:1.

Photo Repair 1x — 8 residual blocks of 128 filters each, zoom factor 1:1.

Photo Deblur 1x — 8 residual blocks of 128 filters each, zoom factor 1:1.

These are trained with parameters that can be reproduced using steps in train/*.sh.
ne1x-photo-deblur-0.3.pkl.bz2(4.92 MB)
ne1x-photo-repair-0.3.pkl.bz2(4.91 MB)
ne2x-photo-default-0.3.pkl.bz2(4.47 MB)
ne4x-photo-default-0.3.pkl.bz2(4.03 MB)
v0.2(Nov 3, 2016)
This is the second public version of the script; download it from revision [b24fa74] or tag v0.2.

Model files are available for download below:

Small 1x — 8 residual blocks of 64 filters each, zoom factor 1:1.

Small 2x — 8 residual blocks of 256 filters each, zoom factor 2:1.

Large 2x — 8 residual blocks of 256 filters each, zoom factor 2:1.

These are trained with the exact same parameters, and can be reproduced using steps in scripts/*.sh.
ne1x-small-0.2.pkl.bz2(2.02 MB)
ne2x-large-0.2.pkl.bz2(30.20 MB)
ne2x-small-0.2.pkl.bz2(1.91 MB)
v0.1(Oct 28, 2016)
The first public version of the script, download it from revision [b1c054c] or tag v0.1.

Model files are available for download below:

Small — 8 residual blocks of 64 filters each.

Medium — 6 residual blocks of 128 filters each.

Large — 12 residual blocks of 128 filters each.

These are trained with varying parameters and duration, and provide different results.
ne4x-large-0.1.pkl.bz2(7.93 MB)
ne4x-medium-0.1.pkl.bz2(4.98 MB)
ne4x-small-0.1.pkl.bz2(1.80 MB)
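
The model files listed across these releases seem to follow a naming pattern: zoom factor, an optional image type, model name, and version. The parser below is a sketch based only on the file names shown here; the pattern is inferred, not documented, and the function name is an assumption.

```python
import re

# ne<zoom>x-[<type>-]<model>-<version>.pkl.bz2, inferred from the listed files.
MODEL_FILE = re.compile(
    r"^ne(?P<zoom>\d+)x-(?:(?P<type>[a-z]+)-)?(?P<model>[a-z]+)-"
    r"(?P<version>\d+\.\d+)\.pkl\.bz2$")


def parse_model_file(name):
    """Split a release file name into zoom, type, model and version."""
    match = MODEL_FILE.match(name)
    if match is None:
        raise ValueError("Unrecognized model file name: %r" % name)
    info = match.groupdict()
    info["zoom"] = int(info["zoom"])
    return info  # "type" is None for names without an image type
```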

Owner

Alex J. Champandard

Artificial Intelligence specialist, co-Founded creative.ai, Director nucl.ai conference, Deep Learning, ex-R☆/Guerrilla Games Senior AI Programmer.
