Related papers:
LeNet: Handwritten Digit Recognition with a Back-Propagation Network;
Gradient-Based Learning Applied to Document Recognition (the starting point of CNNs);
AlexNet: ImageNet Classification with Deep Convolutional Neural Networks (laid the foundation for CNNs);
OverFeat: OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks;
ZFNet: Visualizing and Understanding Convolutional Networks (visualization and interpretability work built on AlexNet);
VGG: VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION (pushes module stacking to the extreme);
Inception V1/GoogLeNet: Going deeper with convolutions (takes an unconventional route, proposing non-standard factorized and parallel modules; the foundation of the Inception architecture);
BN-Inception: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Inception + Batch Normalization);
Inception V2/Inception V3: Rethinking the Inception Architecture for Computer Vision (builds on Inception V1 and leads to Inception-V4 and Xception; continues to factorize the modules);
Inception-V4, Inception-ResNet: Inception-V4, Inception-ResNet and the Impact of Residual Connections on Learning (pure Inception blocks, plus a combination of ResNet and Inception);
Xception: Xception: Deep Learning with Depthwise Separable Convolutions (Xception stands for "extreme Inception": Inception factorized to the extreme);
ResNet V1: Deep Residual Learning for Image Recognition (Kaiming He; introduces the residual connection, the founding work of the ResNet family);
ResNet V2: Identity Mappings in Deep Residual Networks (Kaiming He; an improvement on V1 by the same authors);
DenseNet: Densely Connected Convolutional Networks;
ResNeXt: Aggregated Residual Transformations for Deep Neural Networks (Kaiming He's team);
DualPathNet: Dual Path Networks;
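SENet: Squeeze-and-Excitation Networks (proposes the SE module, which can be easily inserted into other networks, giving rise to a series of SE-X networks);
Res2Net: Res2Net: A New Multi-scale Backbone Architecture;
ResNeSt: ResNeSt: Split-Attention Networks (a synthesis of the preceding ideas);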
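NAS: NEURAL ARCHITECTURE SEARCH WITH REINFORCEMENT LEARNING (the pioneering work on neural architecture search, in which networks are designed by AI);
NASNet: Learning Transferable Architectures for Scalable Image Recognition (predicts block parameters instead of layer parameters);
MnasNet: MnasNet: Platform-Aware Neural Architecture Search for Mobile (suited to compute-constrained devices such as mobile);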
MobileNets series:
MobileNet V1: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications;
MobileNetV2: Inverted Residuals and Linear Bottlenecks;
MobileNetV3: Searching for MobileNetV3 (an architecture found by AI-driven search);
SqueezeNet: SqueezeNet: ALEXNET-LEVEL ACCURACY WITH 50X FEWER PARAMETERS AND <0.5MB MODEL SIZE;
ShuffleNet V1: ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices;
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design;
EfficientNet V1: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks;
EfficientNetV2: Smaller Models and Faster Training;
Transformer: Attention Is All You Need (the pioneering work);
ViT: AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE (a milestone in applying Transformers to computer vision);
Swin: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (a vision Transformer);
VAN: Visual Attention Network (not a Transformer; it only borrows ideas from Transformers into a CNN);
PVT: Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions (pyramid structure + Transformer);
TNT: Transformer in Transformer;
MLP-Mixer: MLP-Mixer: An all-MLP Architecture for Vision;
ConvMixer: Patches Are All You Need? (argues for the hypothesis that ViT's performance is mainly attributable to using patches as the input representation);