Reading Notes on Transformer Papers
Vision Transformer With Deformable Attention. CVPR, 2022.
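Authors: Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang.
Remark: Deformable Attention Transformer (DAT). Proposes a deformable self-attention module in which the key and value pairs are selected in a data-dependent way.
Figure: visualization of the effect of DAT on images.
Figure: structure of the deformable attention module in DAT.
Figure: the offset network used inside the module.
Figure: the complete DAT network architecture.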
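The data-dependent selection of keys and values can be illustrated with a small sketch. It is not the authors' implementation: it only shows the idea of shifting a uniform reference grid by per-point offsets and sampling the feature map bilinearly at the shifted positions. The function names (`bilinear_sample`, `deformed_samples`) are hypothetical, the feature map has a single channel, and the offsets are hand-picked placeholders. In DAT the offsets come from the offset network applied to the queries, and the sampled features are then projected to keys and values.

```python
import math

def bilinear_sample(feat, y, x):
    """Sample a single-channel feature map (list of rows) at a fractional position."""
    h, w = len(feat), len(feat[0])
    # Clamp so that deformed points stay inside the feature map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def deformed_samples(feat, stride, offsets):
    """Shift a uniform reference grid by per-point offsets and sample there."""
    h, w = len(feat), len(feat[0])
    ref = [(y, x) for y in range(0, h, stride) for x in range(0, w, stride)]
    return [bilinear_sample(feat, y + oy, x + ox)
            for (y, x), (oy, ox) in zip(ref, offsets)]

# Toy 4x4 feature map whose value at (row, col) is 4*row + col.
feat = [[float(4 * r + c) for c in range(4)] for r in range(4)]
# Placeholder offsets (dy, dx), one per reference point; assumed, not learned.
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, -0.5)]
sampled = deformed_samples(feat, 2, offsets)
print(sampled)  # [0.0, 4.0, 8.5, 11.5]
```

With zero offsets the sampling falls back to a regular strided grid, so the offsets are what makes the set of keys and values depend on the input.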
Uformer: A General U-Shaped Transformer for Image Restoration
Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, Houqiang Li
Remark:
Contributions of the paper:
- locally-enhanced window (LeWin) Transformer block,
- a learnable multi-scale restoration modulator in the form of a multi-scale spatial bias
Tasks tested in the numerical experiments:
- image denoising,
- motion deblurring,
- defocus deblurring,
- deraining.
code link
Figure: architecture of Uformer.
Figure: structure of the LeWin Transformer block, the main building block of Uformer.
Window-based multi-head self-attention (W-MSA)
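A minimal sketch of window-based self-attention follows, under strong simplifying assumptions: a single-channel feature map, a single head, and identity projections (query, key and value are all the raw token value). The function names are hypothetical and this is not the Uformer code. It only shows that the map is cut into non-overlapping M x M windows and that attention is computed inside each window independently.

```python
import math

def window_partition(feat, m):
    """Split an H x W map into non-overlapping m x m windows of flattened tokens."""
    h, w = len(feat), len(feat[0])
    assert h % m == 0 and w % m == 0, "map size must be divisible by the window size"
    windows = []
    for top in range(0, h, m):
        for left in range(0, w, m):
            windows.append([feat[top + i][left + j]
                            for i in range(m) for j in range(m)])
    return windows

def window_self_attention(tokens):
    """Scalar single-head attention within one window (q = k = v = token value)."""
    out = []
    for q in tokens:
        scores = [q * k for k in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append(sum(wt * v for wt, v in zip(weights, tokens)) / total)
    return out

# Toy 4x4 map split into four 2x2 windows.
feat = [[float(4 * r + c) for c in range(4)] for r in range(4)]
windows = window_partition(feat, 2)
attended = [window_self_attention(win) for win in windows]
print(windows[0])  # [0.0, 1.0, 4.0, 5.0]
```

Because each token attends only to the M x M tokens of its own window, the attention cost grows linearly with the number of windows rather than quadratically with the full image size.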
StyleSwin: Transformer-Based GAN for High-Resolution Image Generation
Bowen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo
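Remark: Proposes a model for high-resolution image generation, and uses a wavelet discriminator to suppress blocking artifacts in the generated images.
Figure: experimental results.
Figure: structure of the basic transformer block.
Figure: network structure for style injection.
Double attention structure, which enlarges the receptive field.
Because window-based attention is used, the generated images exhibit blocking artifacts.
Design of the auxiliary discriminator based on multi-scale wavelet spectra.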
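To make the multi-scale wavelet idea concrete, here is a minimal sketch of a multi-level 2-D Haar decomposition. It is not the StyleSwin discriminator itself: it only produces the kind of frequency sub-bands that such a discriminator could inspect. The function names and the sub-band naming (`dx`, `dy`, `dd` for horizontal, vertical and diagonal differences) are assumptions made for illustration, and naming conventions for these bands vary between libraries.

```python
def haar_dwt2(img):
    """One level of an orthonormal 2-D Haar transform on an even-sized image."""
    h, w = len(img), len(img[0])
    assert h % 2 == 0 and w % 2 == 0, "image sides must be even"
    ll, dx, dy, dd = [], [], [], []
    for i in range(0, h, 2):
        row_ll, row_dx, row_dy, row_dd = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            row_dx.append((a - b + c - d) / 2)  # difference across columns
            row_dy.append((a + b - c - d) / 2)  # difference across rows
            row_dd.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(row_ll); dx.append(row_dx); dy.append(row_dy); dd.append(row_dd)
    return ll, dx, dy, dd

def wavelet_pyramid(img, levels):
    """Apply the Haar transform repeatedly to the low-frequency band."""
    bands, current = [], img
    for _ in range(levels):
        current, dx, dy, dd = haar_dwt2(current)
        bands.append((dx, dy, dd))
    return current, bands

# Toy image with a vertical edge between column 0 and column 1.
img = [[0.0, 1.0, 1.0, 1.0] for _ in range(4)]
ll, dx, dy, dd = haar_dwt2(img)
coarse, bands = wavelet_pyramid(img, 2)
print(dx)  # [[-1.0, 0.0], [-1.0, 0.0]]
```

The vertical edge shows up only in the column-difference band, which suggests why abrupt discontinuities such as window boundaries are easier to detect in the high-frequency sub-bands than in pixel space.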