Face technology: a framework for restoring blurry portraits into high-quality, high-definition images (with source code download)
2022-07-19 15:11:00 【Computer Vision Research Institute】

Paper: https://arxiv.org/pdf/2201.06374.pdf
Code: https://github.com/wzhouxiff/RestoreFormer.git
Computer Vision Institute column
Author: Edison_G
01
Summary
Blind face restoration aims to recover high-quality face images from inputs with unknown degradations. Because face images contain rich contextual information, the researchers propose RestoreFormer, a method that uses fully spatial attention to model this contextual information and outperforms existing approaches that rely on local operators.

RestoreFormer has several advantages over prior work. First, unlike the conventional multi-head self-attention used in earlier Vision Transformers (ViT), RestoreFormer incorporates a multi-head cross-attention layer to learn fully spatial interactions between corrupted queries and high-quality key-value pairs. Second, the key-value pairs in RestoreFormer are sampled from a reconstruction-oriented high-quality dictionary whose elements are rich in high-quality facial features learned specifically for face reconstruction, which leads to better restoration results. Third, RestoreFormer outperforms state-of-the-art methods on one synthetic dataset and three real-world datasets, and produces images with better visual quality.
02
Background
Blind face restoration aims to recover high-quality faces from faces that have undergone complex and diverse degradations such as downsampling, blur, noise, and compression artifacts. Because the degradation is unknown in real-world settings, restoration is a challenging task. Previous work shows that additional priors play a crucial role in this task. They fall roughly into three types: geometric priors, reference priors, and generative priors.
Methods based on geometric priors typically use landmark heat maps or facial component heat maps to restore the face progressively. Because these geometric priors are mostly estimated from the low-quality face itself, the corrupted input limits restoration performance. Reference-based methods, on the other hand, require a reference image with the same identity as the degraded face, which is not always available. Some researchers have relaxed this requirement by collecting component dictionaries of high-quality facial component features to serve as general references. However, the facial details in these dictionaries are limited, because they are extracted offline with a recognition-oriented model and cover only a few facial components.
Vision Transformer. The Transformer is a deep neural network originally designed for natural language processing. Thanks to its strong representation ability, it has since been applied to computer vision tasks such as recognition, detection, and segmentation. Low-level vision tasks have benefited as well. Some researchers exploit the Transformer's strength in large-scale pre-training to build a single model that covers multiple image processing tasks, such as denoising, deraining, and super-resolution. Esser et al. 【Patrick Esser, Robin Rombach, and Bjorn Ommer. Taming transformers for high-resolution image synthesis】 apply a transformer to generate high-resolution images by predicting a sequence of codebook indices of its encoder, making full use of the transformer's strong representation capacity at an acceptable computational cost. In 【Mingrui Zhu, Changcheng Liang, Nannan Wang, Xiaoyu Wang, Zhifeng Li, and Xinbo Gao. A sketch-transformer network for face photo-sketch synthesis】, a transformer is used to capture the global structure of the face to aid photo-sketch synthesis.
03
Analysis of the new framework

(a) MHSA is the multi-head self-attention used in most previous ViTs. Its queries, keys, and values all come from the degraded information Zd. (b) MHCA is the multi-head cross-attention used in the proposed RestoreFormer. It spatially fuses the degraded information Zd with its corresponding high-quality priors Zp by taking Zd as the queries and Zp as the key-value pairs. (c) The full RestoreFormer pipeline. An encoder Ed first extracts the representation Zd of the degraded face Id, and the nearest high-quality priors Zp are fetched from the HQ Dictionary D. Two MHCA layers then fuse the degraded features Zd with the priors Zp. Finally, a decoder Dd is applied to the fused representation Z'f to restore the high-quality face Îd.
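The difference between (a) and (b) comes down to where the keys and values are taken from. The following is a minimal sketch in plain Python, not the authors' implementation: the function names are hypothetical, tokens are plain lists of floats, and the learned query/key/value projections, normalization, and feed-forward layers of a real transformer block are omitted (treated as identity) to keep the example short.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def multi_head_cross_attention(z_d, z_p, num_heads):
    """Queries come from z_d; keys and values come from z_p.

    z_d, z_p: lists of token vectors with the same number of channels.
    Assumption of this sketch: no learned projections, so each head
    simply works on its own slice of the channels.
    """
    channels = len(z_d[0])
    assert channels % num_heads == 0, "channels must divide evenly into heads"
    head_dim = channels // num_heads
    fused = [[] for _ in z_d]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q = [t[lo:hi] for t in z_d]
        k = [t[lo:hi] for t in z_p]
        v = [t[lo:hi] for t in z_p]
        # Concatenate this head's output onto each token.
        for token, head_out in zip(fused, attention(q, k, v)):
            token.extend(head_out)
    return fused


def multi_head_self_attention(z_d, num_heads):
    """MHSA as the special case where keys and values also come from z_d."""
    return multi_head_cross_attention(z_d, z_d, num_heads)


# Toy degraded features and priors (made-up numbers for illustration).
z_d = [[1.0, 0.0], [0.0, 1.0]]
z_p = [[2.0, 2.0], [4.0, 4.0]]
print(multi_head_cross_attention(z_d, z_p, num_heads=2))
```

In this sketch every output token is a weighted average of prior tokens, with weights driven by the degraded queries, which is the sense in which high-quality information is injected at each spatial position.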

Comparison of prior dictionaries. (a) The component dictionaries proposed in DFDNet are generated offline with a VGG network and clustered with K-means. They cover only the eyes, nose, and mouth. (b) The HQ Dictionary proposed in this work is learned by a high-quality face generation network combined with the idea of vector quantization. The high-quality priors in the HQ Dictionary are reconstruction-oriented and provide more facial detail for restoring degraded faces. In addition, the priors in the HQ Dictionary cover all facial regions.
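The vector-quantization idea behind the dictionary lookup can be illustrated with a nearest-neighbor search: each degraded feature is replaced by the dictionary entry closest to it. This is only a sketch under stated assumptions. The function names are hypothetical, squared Euclidean distance is assumed as the similarity measure (the usual choice in vector quantization), and the tiny dictionary is made up for illustration rather than learned.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_prior(feature, dictionary):
    """Return the dictionary entry closest to `feature`."""
    return min(dictionary, key=lambda entry: squared_distance(feature, entry))


def lookup_priors(z_d, dictionary):
    """Map every degraded feature vector in z_d to its nearest prior."""
    return [nearest_prior(feature, dictionary) for feature in z_d]


# Hypothetical 3-entry dictionary of 2-channel features.
hq_dictionary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
z_d = [[0.9, 1.2], [4.0, 6.0], [-1.0, 0.2]]
z_p = lookup_priors(z_d, hq_dictionary)
print(z_p)
```

The resulting list has one prior per degraded feature, so it can serve directly as the key-value input to a cross-attention layer whose queries are the degraded features.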
04
Experiments and visualization

