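Facial expression recognition is an active research topic in deep learning. In real-world scenarios, face image acquisition is easily affected by uncontrollable external factors, which introduce slight deformation and local displacement into expression images; this lowers the recognition rate to a level that falls short of practical requirements. This design therefore addresses the recognition and classification of static facial expressions and proposes a multi-channel input convolutional neural network (MCI-CNN) algorithm.
MCI-CNN consists of eight sub-models. At test time, the face image under test is processed by random cropping, rotation, and similar operations to generate eight images, which are fed to the eight sub-models for simultaneous prediction. The predictions of the sub-models are then combined by a linear weighted fusion algorithm to give the final expression class, which improves robustness to slight deformation and local displacement of the face image. Each sub-model improves on AlexNet by borrowing the 1*1 convolution idea from GoogLeNet, which increases the model's nonlinear expressive power. To prevent overfitting, the dataset is augmented and dropout is introduced. To improve training efficiency, training samples are fed through a multi-threaded random shuffle queue.
Experiments show that MCI-CNN achieves a recognition rate of 68.846% on the FER2013 expression database, 2.7% higher than a single sub-model and clearly better than the other methods in the 2013 facial expression recognition competition, which supports the effectiveness of the proposed algorithm.
Keywords: facial expression recognition; convolutional neural network; MCI-CNN; TensorFlow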
Abstract
Facial expression recognition is a research hotspot in the field of deep learning. In real-life scenarios, the acquisition of face images is easily affected by uncontrollable factors, which cause slight deformation and local displacement in the expression images. The recognition rate drops as a result and fails to meet practical needs. This design therefore recognizes and classifies static facial expressions and proposes a multi-channel input convolutional neural network (MCI-CNN) algorithm.
The multi-channel input convolutional neural network (MCI-CNN) consists of eight sub-models. During testing, the face image to be tested is randomly cropped, rotated, and otherwise transformed to generate eight images, and the eight images are sent to the respective sub-models for simultaneous prediction. The prediction results of the sub-models are combined by a linear weighted fusion algorithm to obtain the final expression classification, which improves robustness to slight deformation and local displacement of the face image. Each sub-model is an AlexNet improved with the 1*1 convolution idea from GoogLeNet, which increases the nonlinear expressive ability of the model. To prevent over-fitting, data augmentation and dropout are introduced. To address training efficiency, samples are fed through a multi-threaded random shuffle queue.
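The test-time procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the thesis's own TensorFlow code: the names `make_views`, `fuse_predictions`, and `predict`, the crop size passed by the caller, and the normalization of the fusion weights are all hypothetical. Rotation is omitted for brevity, and each sub-model is represented by any callable that maps an image to a list of class probabilities.

```python
import random

NUM_VIEWS = 8  # the abstract states eight sub-models, one input image each


def random_crop(image, crop_size, rng):
    """Return a random crop_size x crop_size window of a 2-D image (list of rows)."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_size)    # randint is inclusive at both ends
    left = rng.randint(0, width - crop_size)
    return [row[left:left + crop_size] for row in image[top:top + crop_size]]


def make_views(image, crop_size, rng, num_views=NUM_VIEWS):
    """Generate num_views perturbed copies of one face image.

    Assumption: only random cropping is applied here; the source also
    mentions rotation, which is left out of this sketch.
    """
    return [random_crop(image, crop_size, rng) for _ in range(num_views)]


def fuse_predictions(probabilities, weights):
    """Linear weighted fusion: weighted sum of per-model class probabilities.

    Assumption: weights are normalized to sum to 1 so the fused scores
    remain a probability distribution.
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per sub-model")
    total = sum(weights)
    fused = [0.0] * len(probabilities[0])
    for probs, weight in zip(probabilities, weights):
        for c, p in enumerate(probs):
            fused[c] += weight * p / total
    return fused


def predict(image, sub_models, weights, crop_size, rng):
    """Run each sub-model on its own view; return (class index, fused scores)."""
    views = make_views(image, crop_size, rng, num_views=len(sub_models))
    probabilities = [model(view) for model, view in zip(sub_models, views)]
    fused = fuse_predictions(probabilities, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused
```

With stand-in models that return fixed probabilities, `predict(image, models, [1.0] * 8, 42, random.Random(0))` returns the index of the class with the largest fused score together with the fused scores. The abstract does not say how the fusion weights are chosen, so equal weights serve only as a placeholder here.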
Experiments show that MCI-CNN achieves a recognition rate of 68.846% on the FER2013 expression database, 2.7% higher than a single sub-model and significantly better than the other methods in the 2013 facial expression recognition competition, which confirms the effectiveness of the proposed algorithm.
Keywords: facial expression recognition; convolutional neural network; MCI-CNN; TensorFlow
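Contents
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
1.1 Research Background and Significance
1.2 Research Status
1.3 Main Content of This Thesis
Chapter 2 Fundamentals of Facial Expression Recognition
2.1 Common Expression Databases
2.2 Face Recognition
2.3 Image Preprocessing
2.3.1 Geometric Normalization
2.3.2 Grayscale Normalization
2.3.3 Histogram Equalization
2.4 Mainstream Artificial Intelligence Frameworks
2.5 Chapter Summary
Chapter 3 Multi-Channel Input Convolutional Neural Network
3.1 Basic Principles of CNNs
3.1.1 CNN Structure
3.1.2 Activation Functions
3.2 Facial Expression Recognition Algorithm Based on MCI-CNN
3.2.1 MCI-CNN Network Architecture
3.2.2 CNN Model Structure and Parameter Settings
3.3 Chapter Summary
Chapter 4 Model Design and Analysis for Facial Expression Recognition
4.1 Overall Block Diagram of Expression Recognition
4.2 Dataset Preprocessing
4.3 Data Augmentation
4.4 Optimization of Model Training
4.4.1 TensorFlow's Data Reading Mechanism
4.4.2 Training Samples via a Multi-Threaded Random Shuffle Queue
4.5 Analysis of Experimental Results and Demonstration
4.6 Chapter Summary
Chapter 5 Summary and Outlook
5.1 Summary
5.2 Future Work
References
Acknowledgments
Appendices
Appendix 1 Dataset Preprocessing Code
Appendix 2 CNN Model Code
Appendix 3 Model Training Code
Appendix 4 Test Code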