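Improvement reference, by 魔鬼导师: YOLOV7改进-添加注意力机制_哔哩哔哩_bilibili
Video tutorial: YOLOV7改进-添加注意力机制_哔哩哔哩_bilibili
GitHub project with the improvements (see its cv_attention directory): GitHub - z1069614715/objectdetection_script: code and improvement ideas for object detection scripts; see readme.md for details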
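What I studied:
Questions:
Topics involved:
Some basic concepts: number of channels, kernel size, and convolution stride.
These are explained in the post "一些基本概念" on To-'s CSDN blog (一些基本概念_To-的博客-CSDN博客).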
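To make the three concepts concrete, here is a minimal sketch of how kernel size and stride determine the spatial size of a convolution's output, while the output channel count is simply whatever the layer is configured with. It uses the standard convolution size formula. The 640x640 input and the default padding of `kernel // 2` are assumptions for illustration, and the helper names are made up for this example.

```python
def conv_output_size(size, kernel, stride, padding=None):
    """Spatial output size of a convolution along one dimension."""
    if padding is None:
        # Assumption: "same"-style padding of kernel // 2.
        padding = kernel // 2
    return (size + 2 * padding - kernel) // stride + 1


def conv_output_shape(in_shape, out_channels, kernel, stride):
    """Map a (channels, height, width) shape through one conv layer."""
    _in_channels, height, width = in_shape
    return (
        out_channels,
        conv_output_size(height, kernel, stride),
        conv_output_size(width, kernel, stride),
    )


# Assumed input: a 3-channel 640x640 image.
shape = (3, 640, 640)

# [out_channels, kernel, stride] for four consecutive conv layers.
for args in ([32, 3, 1], [64, 3, 2], [64, 3, 1], [128, 3, 2]):
    shape = conv_output_shape(shape, *args)
    print(args, "->", shape)
```

With stride 1 the height and width are unchanged; with stride 2 they are halved. Only the first argument changes the channel count.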
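Steps for adding an attention mechanism, following the video:
1. Open the model configuration file Yolov7/cfg/training/yolov7.yaml
The yolov7.yaml configuration file stores the 100-plus layers that make up the YOLOv7 model. It is easiest to understand by reading the code comments alongside the model structure diagram.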
# parameters
nc: 1  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32

# yolov7 backbone
backbone:
  # [from, number, module, args]: input source, number of repeats, module name, argument list
  [[-1, 1, Conv, [32, 3, 1]],  # 0 ##CBS
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2 ##CBS
   [-1, 1, Conv, [64, 3, 1]],  # ##CBS  Example: the first field, -1, takes the previous layer's output as this layer's input; -1 means one layer back, -2 means two layers back, and so on
   [-1, 1, Conv, [128, 3, 2]],  # 3-P2/4 ##CBS  Example: the second field, 1, means the layer is applied once
   [-1, 1, Conv, [64, 1, 1]],  # Example: Conv is the name of the layer's operation
   [-2, 1, Conv, [64, 1, 1]],  # [64, 1, 1]: 64 is the number of output feature-map channels, which controls the depth of the feature map
   [-1, 1, Conv, [64, 3, 1]],  # [64, 3, 1]: 3 is the kernel size; a 3*3 kernel is convolved over the input to extract features
   [-1, 1, Conv, [64, 3, 1]],  # [64, 3, 1]: 1 is the stride, which controls how the feature-map size changes after each convolution
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 11 ##C7_1

   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 16-P3/8 ##MP-C3
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 24 ##C7_1 insertion point

   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 29-P4/16 ##MP-C3
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 37 ##C7_1 insertion point

   [-1, 1, MP, []],
   [-1, 1, Conv, [512, 1, 1]],
   [-3, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, -3], 1, Concat, [1]],  # 42-P5/32 ##MP-C3
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -3, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [1024, 1, 1]],  # 50 ##C7_1
  ]

# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]],  # 51 ##SPPCSPC insertion point

   [-1, 1, Conv, [256, 1, 1]],  ##CBS
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  ##upsampling
   [37, 1, Conv, [256, 1, 1]],  # route backbone P4 ##CBS
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 63 ##C7_2

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [24, 1, Conv, [128, 1, 1]],  # route backbone P3
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],  # 75 ##C7_2

   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3, 63], 1, Concat, [1]],

   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],  # 88 ##C7_2

   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3, 51], 1, Concat, [1]],

   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],  # 101 ##C7_2

   [75, 1, RepConv, [256, 3, 1]],
   [88, 1, RepConv, [512, 3, 1]],
   [101, 1, RepConv, [1024, 3, 1]],

   [[102,103,104], 1, IDetect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
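The `[from, number, module, args]` convention in the config can be checked with a small script. The sketch below is only an illustration, not the parser YOLOv7 itself uses. It copies the first 12 backbone entries, resolves each relative `from` value to an absolute layer index, and tracks output channels. It assumes a 3-channel input and `width_multiple: 1.0`, so channel counts are used as written. The function names `resolve_sources` and `trace` are hypothetical.

```python
# First 12 backbone entries from the config: [from, number, module, args]
LAYERS = [
    [-1, 1, "Conv", [32, 3, 1]],
    [-1, 1, "Conv", [64, 3, 2]],
    [-1, 1, "Conv", [64, 3, 1]],
    [-1, 1, "Conv", [128, 3, 2]],
    [-1, 1, "Conv", [64, 1, 1]],
    [-2, 1, "Conv", [64, 1, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [-1, 1, "Conv", [64, 3, 1]],
    [[-1, -3, -5, -6], 1, "Concat", [1]],
    [-1, 1, "Conv", [256, 1, 1]],
]


def resolve_sources(index, frm):
    """Turn a `from` field into absolute layer indices.

    A result of -1 (only possible for layer 0) stands for the network input.
    """
    refs = frm if isinstance(frm, list) else [frm]
    return [r if r >= 0 else index + r for r in refs]


def trace(layers, input_channels=3):
    """Return (index, module, source indices, output channels) per layer."""
    out_channels = []
    rows = []
    for i, (frm, _number, module, args) in enumerate(layers):
        sources = resolve_sources(i, frm)
        in_ch = [input_channels if s < 0 else out_channels[s] for s in sources]
        if module == "Concat":
            ch = sum(in_ch)   # concatenation along the channel axis
        elif module == "Conv":
            ch = args[0]      # first argument is the output channel count
        else:
            ch = in_ch[0]     # other modules: assume channels are unchanged
        out_channels.append(ch)
        rows.append((i, module, sources, ch))
    return rows


for row in trace(LAYERS):
    print(row)
```

Layer 5 reads from layer 3 (two layers back), and the Concat at layer 10 joins layers 9, 7, 5, and 4, each with 64 channels, giving 256 channels.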
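The corresponding YOLOv7 network structure diagram:
As the diagram shows, the attention mechanism is added at three positions in YOLOv7: the feature-layer outputs after C7_1, C7_1, and SPPCSPC (marked as insertion points at layers 24, 37, and 51 in the config). This is equivalent to replacing these three modules with versions that include an attention mechanism.
The code for the attention module to add can be found on 魔鬼老师's GitHub: GitHub - z1069614715/objectdetection_script: code and improvement ideas for object detection scripts; see readme.md for details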
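One practical consequence of adding layers at these three positions is that the absolute `from` indices in the head no longer point at the same layers. The sketch below shows the bookkeeping, assuming each attention module is added as its own layer entry directly after layers 24, 37, and 51. Whether the video does it this way is not confirmed here, and `shifted` is a hypothetical helper. Relative references such as -1 or -3 are left alone because, in this config, none of them reaches back past an insertion point.

```python
# Assumption: one new layer is inserted directly after each of these layers.
INSERT_AFTER = [24, 37, 51]


def shifted(ref, insert_after=INSERT_AFTER):
    """New index for an absolute `from` reference to original layer `ref`.

    Every insertion point at or before `ref` pushes the reference one layer
    further. Counting the insertion point itself redirects the reference to
    the new attention layer, so later layers consume the attended features.
    """
    return ref + sum(1 for p in insert_after if p <= ref)


# Absolute references that appear in the head of the config.
ABSOLUTE_REFS = [37, 24, 63, 51, 75, 88, 101, 102, 103, 104]

for old in ABSOLUTE_REFS:
    print(f"{old:>3} -> {shifted(old)}")
```

Under these assumptions, the route to backbone P4 changes from 37 to 39, the route to P3 from 24 to 25, and the detection inputs from [102,103,104] to [105,106,107].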