SinGAN: Learning a Generative Model from a Single Natural Image
2022-07-18 13:32:00 【Kun Li】
SinGAN's task is internal learning: from noise, it generates images similar to a given single image. I had originally hoped SinGAN could do image completion, by which I mean image extension: when an image is not wide enough on either side, extend it outward. If content-aware resizing could generate realistic extended scenery in this way, the effect would be quite striking. After reading the full paper, however, what SinGAN does is use a pyramid of models to learn implicit information from patches, and in this way model the internal statistics of a single image.

Abstract
SinGAN is an unconditional GAN that captures the internal distribution of patches within an image. It consists of a pyramid of fully convolutional GANs, each responsible for learning the patch distribution at one scale. This allows generating new samples of arbitrary size and aspect ratio that show significant variability while preserving both the global structure and the fine textures of the training image. From this it also seems possible to do some extension-style tasks, i.e. enlarge the image canvas.
1. Introduction
This paper opens up a new direction: unconditional GAN learning from a single natural image. The internal statistics of patches within a single image are rich enough to learn a powerful generative model. Modeling the patch distribution of a single natural image has long been a challenge, with applications such as denoising, deblurring, super-resolution, dehazing and image editing.

2. Related work
2.1 Single image deep models
These works overfit a deep model to a single training example.
2.2 Generative models for image manipulation
The interest here is not in capturing features common to a class of images. Instead, the training data is different: all overlapping patches, at multiple scales, of a single natural image, and from this data a powerful generative model can be learned.
3. Method
The goal is to learn an unconditional GAN that captures the internal statistics of a single training image x. Unlike a conventional GAN, the training samples here are patches of a single image rather than whole images from a dataset.
The goal goes beyond texture generation, which requires many different scales to capture the statistics of complex image structures: global properties such as the arrangement and shape of large objects in the picture, as well as fine details and texture. As shown in the figure below, this is done with a hierarchy of patch-GANs, each capturing the patch distribution at a different scale.

3.1 Multi-scale architecture
Gn and Dn form a pair. Generation of an image sample starts at the coarsest level and passes through all generators in turn up to the finest level, with noise injected at every level. All generators and discriminators have the same receptive field, so as generation proceeds to finer scales the receptive field covers a progressively smaller fraction of the image. As shown in the figure above, at the coarsest scale generation starts purely from noise, mapping white Gaussian noise to an image sample.

The figure above is the core diagram of SinGAN. How is the notion of a patch realized at the code level? Usually images are fed to networks as explicit patches, i.e. the picture is cropped into several patches whose features are computed separately; SinGAN is different. SinGAN applies the pyramid idea in data preprocessing: the original input image is resized into 9 scales, but since all 9 are just resized copies, each is still the original picture, and these are fed to the GAN at each level. When a picture is fed into a network with an 11x11 receptive field, the value at each output location is equivalent to feeding the 11x11 image block centered at that position into the network, so patches are handled implicitly.
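To make these two ideas concrete, here is a minimal illustrative sketch (not the official SinGAN code): building the resized pyramid from a single image, and a small fully convolutional discriminator whose 11x11 receptive field means every output value implicitly scores one patch. The 0.75 scale factor, 9 scales and the 5-layer conv stack are my own illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def build_pyramid(img, num_scales=9, scale_factor=0.75):
    """img: (1, 3, H, W) tensor; returns a list from coarsest to finest scale."""
    pyramid = []
    for i in range(num_scales):
        factor = scale_factor ** (num_scales - 1 - i)
        pyramid.append(F.interpolate(img, scale_factor=factor,
                                      mode='bilinear', align_corners=False))
    return pyramid  # pyramid[0] is coarsest, pyramid[-1] is the original size

class PatchDiscriminator(nn.Module):
    """Five 3x3 stride-1 conv layers -> effective receptive field of 11x11.
    The output is a score map; each value judges the 11x11 patch centered at
    that position, so no explicit patch cropping is needed."""
    def __init__(self, channels=32):
        super().__init__()
        layers, in_ch = [], 3
        for _ in range(4):
            layers += [nn.Conv2d(in_ch, channels, 3, padding=1),
                       nn.BatchNorm2d(channels),
                       nn.LeakyReLU(0.2)]
            in_ch = channels
        layers += [nn.Conv2d(in_ch, 1, 3, padding=1)]  # per-patch score map
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

if __name__ == "__main__":
    img = torch.randn(1, 3, 200, 200)      # stand-in for the single training image
    scales = build_pyramid(img)
    score_map = PatchDiscriminator()(scales[-1])
    print([tuple(s.shape[-2:]) for s in scales], score_map.shape)
```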
The intuition in more detail:
- As mentioned earlier, a GAN estimates the distribution of the real data.
- A single image is one sample point drawn from that real data distribution.
- We cannot estimate the underlying distribution from a single sample point, just as, given one value drawn from a normal distribution with unknown mean and variance, we cannot recover the mean and variance.
- An intuitive idea: cut the image into blocks, and suddenly we have many images!
- How big should the blocks be?
- Suppose we cut them small. For a 200x200 image cut into 11x11 blocks, how many can we get at most? More than 30,000. With over 30,000 11x11 blocks we can train a GAN to generate realistic 11x11 image blocks. Sounds reliable? But we have no idea how to stitch these small blocks together into one high-resolution image...
- Suppose we cut them large. For the same 200x200 image, cut into 150x150 blocks, we get at most about 2,500. How big is the sample space? 256^(150x150x3). That sample space is enormous compared with the number of samples we have; in other words, it is very hard to train a generator of realistic 150x150 images from only about 2,500 of them (for comparison, MNIST consists of 28x28 near-binary digit images with 55,000 training samples).
- 11x11 blocks are easy to train on, but what they capture is too fine-grained to help with the overall image; 150x150 blocks describe the overall layout of the image, but are far too hard to train on...
- Perhaps we can shrink the image first and then cut blocks from the shrunken version. The block resolution stays low (easy to train), yet the blocks still retain the overall layout of the image. For example, downsample the original 200x200 image to 40x40 and cut it into 11x11 blocks: roughly 900 of them. A GAN trained on these blocks should preserve the global layout of the image (see the patch-count sketch after this list).
- The images it generates are very blurry, but the overall structure is there; we only need to add details. And it is not hard to see that the 11x11 blocks cut from the original image earlier are exactly what we need to train a GAN that supplies those details.
- Add a few more scales in between, gradually adding detail starting from the blurriest level.
- And there we have SinGAN.
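A small sanity check of the patch counts used above, assuming stride-1 sliding windows (the numbers quoted in the list are rough):

```python
def num_patches(image_size, patch_size, stride=1):
    """Count stride-1 square patches that fit in a square image."""
    per_axis = (image_size - patch_size) // stride + 1
    return per_axis * per_axis

print(num_patches(200, 11))    # 36100 small 11x11 patches from a 200x200 image
print(num_patches(200, 150))   # 2601 large 150x150 patches
print(num_patches(40, 11))     # 900 11x11 patches after downsampling to 40x40
```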
At the coarsest scale, the effective receptive field is typically about half the height of the image, so GN generates the overall layout of the image and the global structure of large objects.

At every level except the coarsest, the generator receives, in addition to the input noise, an upsampled version of the image produced by the previous (coarser) level's generator.
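A minimal sketch of one such coarse-to-fine step, assuming (as the paper describes) that each generator learns a residual on top of the upsampled coarser-scale output; `generator` here stands for any small convolutional net and `noise_amp` for a per-scale noise amplitude, both placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def generate_one_scale(generator: nn.Module, prev_image: torch.Tensor,
                       noise_amp: float, target_hw: tuple) -> torch.Tensor:
    # Upsample the previous (coarser) scale's output to the current resolution.
    upsampled = F.interpolate(prev_image, size=target_hw,
                              mode='bilinear', align_corners=False)
    # Inject spatial noise at this scale.
    noise = noise_amp * torch.randn_like(upsampled)
    # The generator refines the upsampled image; the skip connection makes it
    # learn only the missing details (a residual).
    return generator(noise + upsampled) + upsampled
```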
3.2 Training
The multi-scale architecture is trained sequentially, from the coarsest layer to the finest. Once each GAN is trained, it is kept fixed. The training loss of the n-th GAN combines an adversarial loss and a reconstruction loss:
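For reference, the per-scale objective as given in the SinGAN paper is

$$\min_{G_n}\max_{D_n}\;\mathcal{L}_{\mathrm{adv}}(G_n, D_n) + \alpha\,\mathcal{L}_{\mathrm{rec}}(G_n),$$

where $\mathcal{L}_{\mathrm{adv}}$ is a WGAN-GP adversarial loss over patches of the image at scale $n$, and $\mathcal{L}_{\mathrm{rec}}$ is the squared error between $x_n$ and the image generated from a fixed set of noise maps, so that one particular noise input reconstructs the training image exactly.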

Structurally, then, besides the adversarial loss with gradient penalty (WGAN-GP), there is one extra term, the reconstruction loss, which likely exists mainly to stabilize training.
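For illustration, a minimal sketch of how these two terms could be computed, assuming WGAN-GP for the adversarial part and MSE for the reconstruction; `netD`, `real`, `fake`, `rec` and the loss weights are placeholders, not the official implementation:

```python
import torch
import torch.nn.functional as F

def gradient_penalty(netD, real, fake, lambda_gp=0.1):
    # Interpolate between real and fake images and penalize the critic's
    # gradient norm for deviating from 1 (standard WGAN-GP penalty).
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    scores = netD(interp)
    grads = torch.autograd.grad(outputs=scores, inputs=interp,
                                grad_outputs=torch.ones_like(scores),
                                create_graph=True, retain_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()

def generator_loss(netD, fake, rec, real, alpha_rec=10.0):
    adv = -netD(fake).mean()            # fool the critic
    rec_loss = F.mse_loss(rec, real)    # reconstruction term stabilizes training
    return adv + alpha_rec * rec_loss
```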