YOLOv5 Improvement 20: Introducing a New Neural Network Operator (Involution) into the Network
2022-07-18 20:39:00 【Artificial Intelligence Algorithm Research Institute】
Foreword: YOLOv5, one of the current advanced deep-learning object detection algorithms, already incorporates a large number of tricks, but there is still room for improvement. For the detection difficulties of specific application scenarios, different improvements are possible. The following articles in this series describe in detail how to improve YOLOv5, in the hope of offering some modest help and reference to readers who need innovation in their research or better results in engineering projects.
For more code-related information and Q&A, follow the WeChat official account: Artificial intelligence AI Algorithm engineer
Problem addressed: Classical convolution (conv) has long been a mainstay of neural networks. It has two characteristics: it is spatial-agnostic and channel-specific. In the spatial domain, the former ensures efficiency by reusing the same kernel across different positions and pursues translation equivariance. In the channel domain, a set of convolution kernels collects the different information encoded in different channels, which satisfies the latter. Since VGGNet, modern neural networks have limited the spatial span of convolution kernels to no more than 3*3 to keep the kernels compact.
Shortcomings of conv: 1. It deprives the kernel of the ability to adapt to different visual patterns at different spatial positions. 2. Locality limits the receptive field of convolution, which is a challenge for small targets or blurred images. 3. Redundancy among convolution filters is pronounced, which limits the flexibility of the kernels.
To overcome these limitations of classical convolution, the authors of the paper propose involution. In contrast to convolution, an involution kernel is spatial-specific and channel-agnostic: the kernel differs across spatial positions but is shared across channels, and this sharing along the channel dimension reduces kernel redundancy. Because the involution kernel is parameterized dynamically, it can cover a wide extent in the spatial dimension.
Through this inverted design, the involution proposed in the paper has two advantages over convolution: 1. It can aggregate contextual semantic information over a wider spatial extent, which overcomes the difficulty of modeling long-range interactions. 2. It can adaptively assign weights at different positions, so that the most informative visual elements in the spatial domain are prioritized.
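To make the redundancy argument concrete, here is a minimal sketch that counts the weights of a standard K x K convolution and of an involution kernel-generation branch built from two 1x1 layers. The function names, the reduction ratio of 4 and the group size of 16 are assumptions chosen for illustration; bias and normalization parameters are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def involution_params(c, k, reduction_ratio=4, group_channels=16):
    """Weights of a two-layer 1x1 kernel-generation branch (bias ignored).

    Assumed layout: reduce c -> c // r, then expand to k*k weights for
    each of the c // group_channels groups.
    """
    groups = c // group_channels
    reduce_weights = c * (c // reduction_ratio)
    expand_weights = (c // reduction_ratio) * (k * k * groups)
    return reduce_weights + expand_weights


if __name__ == "__main__":
    c = 256
    print("3x3 conv      :", conv_params(c, c, 3))
    print("7x7 conv      :", conv_params(c, c, 7))
    print("7x7 involution:", involution_params(c, 7))
```

Under these assumptions the 7x7 involution branch needs fewer weights than a 3x3 convolution with the same channel count, since the K*K factor multiplies the number of groups rather than the squared channel count.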
Principle:
Paper link: https://arxiv.org/abs/2103.06255
GitHub code link: https://github.com/d-li14/involution
Compared with standard convolution or depthwise convolution, the involution kernel H is designed to have the inverse characteristics in the spatial and channel domains, hence the name. (Given the explanation of involution in the paper, "inverse-characteristic convolution" seems the most fitting Chinese translation. Many people online also translate it as "combined convolution" or "inner convolution".)
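The inverse characteristics can be seen by writing both operators side by side. The following is a sketch in the notation of this section (input X with C channels, kernel size K, G groups, convolution filter F); the exact index conventions are assumptions, and the paper should be consulted for the authoritative form.

```latex
% Convolution: the filter F has no spatial index (i, j) but a full channel index (k, c).
Y_{i,j,k} = \sum_{c=1}^{C} \sum_{(u,v) \in \Delta_K}
    \mathcal{F}_{k,\,c,\,u+\lfloor K/2 \rfloor,\,v+\lfloor K/2 \rfloor}\; X_{i+u,\,j+v,\,c}

% Involution: the kernel H is indexed by the position (i, j) and only by the group of channel k.
Y_{i,j,k} = \sum_{(u,v) \in \Delta_K}
    \mathcal{H}_{i,\,j,\,u+\lfloor K/2 \rfloor,\,v+\lfloor K/2 \rfloor,\,\lceil kG/C \rceil}\; X_{i+u,\,j+v,\,k}

% Neighbourhood offsets around the centre pixel.
\Delta_K = \{-\lfloor K/2 \rfloor, \dots, \lfloor K/2 \rfloor\} \times \{-\lfloor K/2 \rfloor, \dots, \lfloor K/2 \rfloor\}
```

In the first formula the weights do not depend on the position and are specific to each channel pair; in the second they depend on the position and are shared by all channels of a group.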
In the design of involution, the involution kernel is tailored to the pixel X_{i,j} at coordinate (i, j) but shared across channels. G denotes the number of groups; the channels within each group share the same involution kernel. The output feature map is obtained by multiply-add operations between the involution kernels and the input. Unlike a convolution kernel, the shape of the involution kernel H depends on the shape of the input feature map X. The design idea is to generate the involution kernel conditioned on the original input tensor, so that the generated kernel is aligned with the input.
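The multiply-add described above can be sketched as a naive reference implementation on nested lists. This is an illustration only, assuming stride 1 and zero padding; involution_2d and constant_kernels are hypothetical helpers, and the constant kernels stand in for the input-conditioned kernel generation.

```python
def involution_2d(x, kernels, groups):
    """Naive involution on nested lists (stride 1, zero padding).

    x:       [C][H][W] input feature map
    kernels: [G][H][W][K][K], one K x K kernel per group and per position
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0][0][0])
    pad = k // 2
    group_channels = channels // groups
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for c in range(channels):
        g = c // group_channels  # channels in one group share a kernel
        for i in range(height):
            for j in range(width):
                acc = 0.0
                for u in range(k):
                    for v in range(k):
                        ii, jj = i + u - pad, j + v - pad
                        if 0 <= ii < height and 0 <= jj < width:
                            acc += kernels[g][i][j][u][v] * x[c][ii][jj]
                out[c][i][j] = acc
    return out


def constant_kernels(groups, height, width, k, value):
    """Hypothetical stand-in for the kernel-generation branch."""
    return [[[[[value] * k for _ in range(k)] for _ in range(width)]
             for _ in range(height)] for _ in range(groups)]


if __name__ == "__main__":
    x = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]  # C=2, H=W=3
    y = involution_2d(x, constant_kernels(1, 3, 3, 3, 1.0), groups=1)
    print(y[0])
```

Note that the kernel is looked up with the position (i, j) and the group index only, never with the channel index itself, which is the spatial-specific, channel-agnostic behaviour described in the text.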
Method:
Step 1: modify common.py to define the Involution module.
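import torch.nn as nn  # Conv is YOLOv5's Conv block, already defined in common.py


class Involution(nn.Module):
    def __init__(self, c1, c2, kernel_size, stride):
        super(Involution, self).__init__()
        # c2 follows YOLOv5's (c1, c2, ...) convention but is unused: the output keeps c1 channels
        self.kernel_size = kernel_size  # should be odd so that the padding below preserves the size
        self.stride = stride  # with stride > 1, H and W should be divisible by stride
        self.c1 = c1
        reduction_ratio = 4
        self.group_channels = 16  # c1 must be a multiple of 16
        self.groups = self.c1 // self.group_channels
        # Kernel generation: 1x1 bottleneck, then 1x1 layer giving K*K weights per group
        self.conv1 = Conv(
            c1, c1 // reduction_ratio, 1)
        self.conv2 = Conv(
            c1 // reduction_ratio,
            kernel_size ** 2 * self.groups,
            1, 1)
        if stride > 1:
            self.avgpool = nn.AvgPool2d(stride, stride)
        self.unfold = nn.Unfold(kernel_size, 1, (kernel_size - 1) // 2, stride)

    def forward(self, x):
        # One K*K kernel per output position and per group, conditioned on the input
        weight = self.conv2(self.conv1(x if self.stride == 1 else self.avgpool(x)))
        b, c, h, w = weight.shape
        weight = weight.view(b, self.groups, self.kernel_size ** 2, h, w).unsqueeze(2)
        # out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
        # print("weight shape:",weight.shape)
        # Extract the K*K patch around each position and split the channels into groups
        out = self.unfold(x).view(b, self.groups, self.group_channels, self.kernel_size ** 2, h, w)
        # print("new out:",(weight*out).shape)
        # Weighted sum over the K*K window; the kernel is shared by the channels of a group
        out = (weight * out).sum(dim=3).view(b, self.c1, h, w)
        return out

Step 2: register the Involution module in yolo.py.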
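The article does not show the change to yolo.py. In the YOLOv5 releases I am aware of, registering a module means adding its class to the group of modules in parse_model whose yaml arguments [c2, ...] are rewritten to the constructor form [c1, c2, ...]. The standalone sketch below only imitates that rewriting; resolve_args and CHANNEL_AWARE are hypothetical names, and the real parse_model differs between releases.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


# Hypothetical stand-in for the module group checked inside parse_model;
# registering the new operator means adding its class to that group.
CHANNEL_AWARE = {"Conv", "C3", "SPPF", "Involution"}


def resolve_args(module_name, in_channels, args, width_multiple=1.0):
    """Return (constructor_args, out_channels) for one yaml row."""
    if module_name in CHANNEL_AWARE:
        c1, c2 = in_channels, args[0]
        c2 = make_divisible(c2 * width_multiple, 8)  # apply width scaling
        return [c1, c2, *args[1:]], c2
    return list(args), in_channels


if __name__ == "__main__":
    ctor_args, c_out = resolve_args("Involution", 128, [256, 7, 1], width_multiple=0.5)
    print(ctor_args, c_out)
```

Because the Involution module defined in step 1 ignores c2 and returns c1 channels, the scaled c2 recorded for the next layer should equal c1; otherwise the channel bookkeeping of the following layers would be off.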
Step 3: modify the yaml file.
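The article does not show the yaml change either. YOLOv5 model files list each layer as [from, number, module, args]; the sketch below writes one possible Involution row as Python data so that it can be checked against the constraints of the module from step 1. The values 256, 7 and 1 are assumptions for illustration, check_involution_row is a hypothetical helper, and a width multiple of 1.0 is assumed.

```python
# One row in YOLOv5's [from, number, module, args] layout.
# The values are assumptions for illustration, not the author's config.
involution_row = [-1, 1, "Involution", [256, 7, 1]]  # args: c2, kernel_size, stride


def check_involution_row(row, in_channels, group_channels=16):
    """Return a list of problems for an Involution row (empty if none).

    in_channels is the output channel count of the layer the row reads from.
    """
    _, _, name, args = row
    c2, kernel_size, stride = args
    problems = []
    if name != "Involution":
        problems.append("not an Involution row")
    if in_channels % group_channels != 0:
        problems.append("input channels must be a multiple of group_channels")
    if c2 != in_channels:
        problems.append("module keeps c1 channels, so c2 should equal c1")
    if kernel_size % 2 == 0:
        problems.append("kernel_size should be odd")
    if stride < 1:
        problems.append("stride must be >= 1")
    return problems


if __name__ == "__main__":
    print(check_involution_row(involution_row, in_channels=256))
```

In an actual yaml file the same row would be placed after a layer with a matching channel count, and the layer indices referenced by later rows (for example in Concat and Detect) would need to be shifted accordingly.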
Results: I ran many experiments on multiple datasets; the improvement varies from dataset to dataset.
Preview: the next article will continue to share YOLOv5 improvements. If you are interested, follow me. If you have questions, leave a comment or send me a private message. If I do not reply in time on CSDN, please follow the WeChat official account: Artificial intelligence AI Algorithm engineer.
PS: Involution is not only applicable to YOLOv5; it can also be used to improve other YOLO networks, such as YOLOv4 and YOLOv3.
Finally, I hope we can follow each other, become friends, and learn and exchange ideas together.