Paper-Reading Series Index
Table of Contents
- Paper-Reading Series Index
- Meaning of the Paper Title
- ABSTRACT
- INDEX TERMS
- I. INTRODUCTION
- II. RELATED WORK
- A. FAKE NEWS CLASSIFICATION APPROACHES FOR SINGLE-MODALITY
- 1) SINGLE-MODALITY BASED CLASSIFICATION APPROACHES USING TEXTUAL FEATURES
- 2) SINGLE-MODALITY BASED CLASSIFICATION APPROACHES USING IMAGE FEATURES
- B. FAKE NEWS CLASSIFICATION APPROACHES FOR MULTI-MODALITY
- 1) PROBLEMS IN MULTI-MODALITY
- 2) FLEXIBILITY IN MULTI-MODALITY
- 3) IMPROVEMENT IN MULTI-MODALITY
- III. METHODOLOGY
- A. OVERVIEW
- B. PROBLEM DEFINITION
- C. PREPROCESSING
- D. SENTIMENT ANALYSIS
- E. FEATURE MODELING
- 1) BERT SHARED LAYER
- 2) IMAGE EMBEDDING LAYER
- 3) PRE-FEATURE EXTRACTION
- 4) MULTI-MODAL FEATURE CONCATENATION
- F. ENSEMBLE MODEL
- IV. EVALUATION
- A. RESEARCH QUESTIONS (RQs)
- B. DATASET
- C. PROCESS
- D. METRICS
- E. RESULTS
- 1) RQ1: COMPARISON OF ELD-FN AGAINST BASELINE APPROACHES
- 2) RQ2: INFLUENCE OF SENTIMENT ON ELD-FN
- 3) RQ3: INFLUENCE OF PREPROCESSING ON ELD-FN
- 4) RQ4: COMPARISON OF ELD-FN AGAINST OTHER CLASSIFIERS
- F. THREATS TO VALIDITY
- V. CONCLUSION AND FUTURE WORK
Article Link
Meaning of the Paper Title
Detecting Multi-Modal Fake News Using Ensemble Learning
ABSTRACT
The spread of fake news has become a critical problem in recent years due to the extensive use of social media platforms. False stories can go viral quickly, reaching millions of people before they can be debunked, e.g., a false story claiming that a celebrity has died when he/she is still alive. Therefore, detecting fake news is essential for maintaining the integrity of information and for addressing misinformation, social and political polarization, media ethics, and security threats. From this perspective, we propose an ensemble learning-based detection of multi-modal fake news. First, it exploits a publicly available dataset, Fakeddit, consisting of over 1 million samples of fake news. Next, it leverages Natural Language Processing (NLP) techniques for preprocessing the textual information of news. Then, it gauges the sentiment from the text of each news item. After that, it generates embeddings for the text and images of the corresponding news by leveraging Visual Bidirectional Encoder Representations from Transformers (V-BERT). Finally, it passes the embeddings to a deep learning ensemble model for training and testing. A 10-fold evaluation technique is used to check the performance of the proposed approach. The evaluation results are significant and outperform the state-of-the-art approaches with performance improvements of 12.57%, 9.70%, 18.15%, 12.58%, 0.10, and 3.07 in accuracy, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), and Odds Ratio (OR), respectively.
INDEX TERMS
Ensemble learning, convolutional neural network, multi-modal fake news, classification, boosted CNN, bagged CNN.
I. INTRODUCTION
- The concept of fake news is not new; its roots existed in our society long ago. It refers to false information disseminated to mislead or deceive the public. For example, fake news about COVID-19 vaccines could discourage people from getting vaccinated, leading to increased rates of illness and death. In the past, every kind of distinct material was considered fake news, such as satire, conspiracies, news manipulation, and click-bait. However, fake news has now become jargon [1] and has a huge impact on critical events happening in our society; e.g., the spread of fake news (false stories) on social media was very concerning during the 2016 US presidential election [2].
- Fake news can spread quickly through social media and other online platforms. It can have serious consequences, such as causing panic, influencing elections, and eroding public trust in legitimate news sources. Individuals need to distinguish real news from fake and critically evaluate sources of information before sharing or responding to them. Additionally, news organizations and social media platforms are responsible for combating the spread of fake news by fact-checking and removing false content. Surveys show that about 70% of Americans use social media as a source of news and circulating information [3]. Accessing news and information on the Internet is low-cost and convenient; however, spreading fake news through these channels is equally straightforward and effortless [4]. Fake news can lead to false assumptions that drastically affect our society. Consequently, it is critical to design an automated fake news detection system.
- Many researchers are actively developing new and better methods for identifying and combating the spread of misinformation. Key research areas and trends in this field include deep learning approaches, e.g., the Convolutional Neural Network (CNN); linguistic features, e.g., sentiment analysis, topic modeling, and stylometric analysis; source-based approaches, e.g., analyzing the domain name, social media presence, or history of the news source; and ensemble approaches, e.g., combining linguistic, source-based, and deep learning models to create a more robust and accurate detection system. Although recent research has identified the issues of the said problem and proposed different solutions, e.g., pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) [5], OpenAI GPT [6], and ELMo [7] have shown their effectiveness in alleviating feature engineering efforts, the problem still requires significant performance improvement.
- From this perspective, this paper proposes an ensemble learning-based detection of multi-modal fake news (ELD-FN). It first exploits a publicly available dataset, Fakeddit, a novel multi-modal dataset consisting of over 1 million samples from multiple categories of fake news. Second, it leverages Natural Language Processing (NLP) techniques for preprocessing the textual information of news. Third, it gauges the sentiment from the text of each news item. Fourth, it generates embeddings for the text and images of the corresponding news by leveraging V-BERT [8]. Finally, it passes the embeddings to a deep learning ensemble model for training and testing. A 10-fold evaluation technique is used to check the performance of ELD-FN. The evaluation results are significant and outperform the state-of-the-art approaches with performance improvements of 12.57%, 9.70%, 18.15%, 12.58%, 0.10, and 3.07 in accuracy, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), and Odds Ratio (OR), respectively.
- The main contributions made in this paper are as follows.
• The proposed approach integrates news sentiment as a crucial feature and employs ensemble learning to identify multi-modal fake news.
• It is evident from the evaluation results that ELD-FN is significant and outperforms the baseline approaches with performance improvements of 12.57%, 9.70%, 18.15%, 12.58%, 0.10, and 3.07 in accuracy, precision, recall, F1-score, MCC, and OR, respectively.
- The organization of the rest of the paper is as follows. Section II discusses the research background. Section III describes the details of ELD-FN. Section IV describes the evaluation methods for ELD-FN, the obtained results, and the threats to validity. Section V summarizes the paper and suggests future work.
II. RELATED WORK
Although extensive research on fake news detection has been performed [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], most of it is conducted on textual data or uni-modal features. The two most closely related studies [24], [25] proposed deep learning-based solutions for detecting fake news. The proposed approach (ELD-FN) differs from these baseline approaches in that it not only works with multi-modal features but also considers the sentiment conveyed in the textual information of news.
Most of the state-of-the-art fake news classification approaches can be categorized as follows: 1) fake news classification approaches for single-modality and 2) fake news classification approaches for multi-modality.
A. FAKE NEWS CLASSIFICATION APPROACHES FOR SINGLE-MODALITY
The fake news classification approaches for single-modality can be further divided into two categories based on the text/image features.
1) SINGLE-MODALITY BASED CLASSIFICATION APPROACHES USING TEXTUAL FEATURES
- Textual features can be divided into generic and latent categories. Traditional machine learning algorithms usually utilize generic textual features; they analyze text at linguistic levels such as lexicon, syntax, discourse, and semantics. Previous research has compiled a detailed table summarizing these features [10]. Latent textual features, in contrast, consist of embeddings extracted from the textual data of news at the word, sentence, or document level. Latent vectors are constructed from the textual news data and then used as input for classifiers, e.g., SVM.
- Recurrent Neural Networks (RNNs) are potent in modeling and analyzing sequential data. For example, Ma et al. used RNNs to capture relevant information over time by learning hidden-layer representations [11]. Meanwhile, Chen et al. proposed a CNN-based approach for the classification [12], in which a novel Attention-Residual Network (ARC) is introduced to acquire long-range features. Ma et al. introduced a Generative Adversarial Network (GAN)-based model that employs a generator network based on Gated Recurrent Units (GRU) to generate contentious instances, while a discriminator network based on RNNs identifies the essential features [13].
- RNN-based models have proven very effective on fake news detection datasets. However, they prioritize the most recent inputs, so essential features are effectively expected to be located at the end of the sequence. Yu et al. proposed a CNN-based approach that resolves this issue: it does not prioritize recent input and instead extracts features based on the relationships among the essential features [14]. Vaibhav and Hovy utilize a graph-based approach for classifying news articles [15]. For this purpose, they used Graph Neural Networks, such as Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT), to create graph embeddings for fake news detection.
- Wu et al. utilize multi-task learning techniques to classify and detect fake news; a stance classification task optimizes the shared layers concurrently, improving the news representations [16]. Cheng et al. utilized an LSTM model to classify textual news data, using a variational autoencoder to extract essential textual features at the tweet level.
- Some researchers have assumed that complex and multi-dimensional news is not accessible initially, and that the accessibility of text-only news depends on its popularity [17]. Qian et al. developed a text-based model that utilizes word/sentence-level data from legitimate articles to produce user feedback for early detection, addressing the scarcity of user reviews as an auxiliary source of information [18]; the generated feedback is combined with word/sentence-level information from real articles for the classification process [18]. Giachanou et al. investigated the influence of emotional cues and propose an LSTM model that integrates emotional signals extracted from claim texts to differentiate between true and false news [19].
2) SINGLE-MODALITY BASED CLASSIFICATION APPROACHES USING IMAGE FEATURES
- As multimedia becomes more prevalent in social networks, news now contains text as well as visual information such as images and videos that convey rich meaning. However, textual feature-based approaches face challenges in effectively capturing visual information because of the heterogeneity between text and image data. Consequently, many researchers have proposed image-based approaches for detecting fake news.
- Classical image-based models utilized fundamental numerical features of images [20], [26], such as image count, popularity [27], and type, to identify fake news. For tampered images, complex forensics features were extracted. Furthermore, post- and user-based features were integrated to identify fake news [28]. However, it became evident that basic numerical features are inadequate to describe the complex visual information of news images.
- Deep learning models such as CNNs have proven effective in capturing visual features in news images. Many studies have shown that features extracted from CNN models can be used in visual recognition tasks to generate generic image representations [29].
- Building on the success of CNNs, recent studies have utilized pre-trained deep CNNs like VGG19 [30], [31] to obtain generic visual representations [32], [33]. Researchers suggested multi-domain visual neural models to capture the inherent traits of fabricated news images more effectively. These multi-domain models merged frequency- and pixel-domain visual data to differentiate between genuine and fabricated news based on visual characteristics [34]. Poor quality is a common trait of fake news images. The poor-quality feature and the image semantics are visible in the frequency and pixel domains; the quality feature is extracted by a CNN model, and the semantics of the images are extracted by a CNN-RNN model.
B. FAKE NEWS CLASSIFICATION APPROACHES FOR MULTI-MODALITY
Word-based and image-based information are both important in detecting fake news. As social networks often contain both types of information, combining them can improve performance. This section discusses the different multi-modal approaches for fake news detection, categorized by the perspectives they adopt.
1) PROBLEMS IN MULTI-MODALITY
Several studies have explored using visual information to complement textual information in detecting fake news. These studies typically use text-based and image-based encoders to extract textual and visual features, respectively, and then construct an overall feature vector for each news item. For example, Wang et al. proposed event classification as an auxiliary task to enhance the generalization ability of the model for event-invariant multi-modal features [32]. Other researchers, such as Singhal et al., use a combination of text-based and image-based features; they utilize BERT and XLNet pre-trained models for encoding text-based and image-based data, respectively [35]. However, these approaches have proven limited in effectively detecting multi-modal fake news because of their limited ability to capture complex cross-modal correlations. More advanced multi-modal techniques are needed to improve the performance of fake news detection.
2) FLEXIBILITY IN MULTI-MODALITY
Some studies have recognized that irrelevant images are a common characteristic of multi-modal fake news and have focused on measuring the consistency between the text and visual components for detection. One approach, by Zhou and Zafarani [36], used an image captioning model to generate sentences from images and then measured the similarity between those sentences and the original text. However, this approach was constrained by the discrepancies between the training data of the image captioning model and the real news corpus. Another approach, by Xue et al., projected the visual and textual features into a shared feature space and computed the similarities between the resulting multi-modal features; however, they encountered difficulties capturing multi-modal inconsistencies because of the semantic gap between the two types of features [37]. Ghorbanpour et al. [38] proposed the Fake-News-Revealer (FNR) method, which uses a Vision Transformer [39] and BERT [5] to extract image and text features, respectively. The model extracts textual and visual features separately and measures their similarity through the loss function.
3) IMPROVEMENT IN MULTI-MODALITY
- Several researchers have proposed different approaches for fake news detection using multi-modal data. Jin et al. utilized an RNN model and applied an attention mechanism to combine information extracted from textual, visual, and social context data [40]. Zhang et al. [41] used a multi-channel CNN with an attention mechanism to combine multi-modal information, while Song et al. [42] proposed a co-attention transformer to model the bidirectional enhancement between images and text. Qian et al. developed a Hierarchical Multi-modal Contextual Attention Network (HMCAN) designed to collectively capture multi-modal context data and the hierarchical semantics of text [43]. Wu et al. introduced the Multi-modal Co-Attention Network (MCAN), which extracts spatial-domain and frequency-domain features from the image and text and fuses visual and textual features using multiple co-attention layers [44]. Other researchers have utilized Graph Convolutional Networks (GCN) and entity-centric cross-modal interaction to model the relationship between word-based and image-based objects. Finally, Zhang et al. and Laura et al. proposed BERT-based multi-modal models to encode text-based and image-based information; their models effectively capture the interplay between text and images and employ contrastive learning to enhance multi-modal representations [24], [45]. They also integrated visual entities to enhance the comprehension of high-level semantics in news images and to model the inconsistencies and mutual enhancements of multi-modal entities [22].
- In summary, when performing multi-modal fake news detection, there are three important inductive biases to consider when examining text-image correlations. First, images provide additional information beyond the text, highlighting the need for multi-modal approaches. Second, inconsistencies between text and images can serve as a potential signal for detecting fake news using multiple modalities. Finally, combining text-based and image-based data can improve performance by identifying the essential features.
III. METHODOLOGY
A. OVERVIEW
The overview of ELD-FN is depicted in Fig. 1. The following are the main steps of ELD-FN.
- First, the publicly available multi-modal dataset (Fakeddit) is collected from Google Drive.
- Next, it leverages NLP techniques, e.g., tokenization, stop-word removal, lowercase conversion, and lemmatization, for preprocessing textual information of news.
- Then, it computes the sentiment from the text of each news item.
- After that, it generates embeddings for the text and images of the corresponding news by leveraging V-BERT.
- Finally, it passes the embeddings to the deep learning ensemble model for training and testing.
B. PROBLEM DEFINITION
News n from a collection of multi-modal news N can be represented as follows:
n = ⟨t, i, s⟩ (1)
where t is the textual information of n, i is the image of n, and s is the status assigned to n, i.e., whether n is fake or true.
ELD-FN classifies the status of a new piece of news as true or false, where true means the news is real and false means the news is fake. Therefore, the automatic classification of a new piece of news n can be defined as a mapping f:
f : n → c, c ∈ {true, false}, n ∈ N (2)
where c is the proposed status from the set of news statuses {true, false}.
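To make this definition concrete, the following minimal sketch models one multi-modal news item n = ⟨t, i, s⟩ as a Python data structure; the field names and values are illustrative assumptions for these notes, not identifiers from the paper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NewsSample:
    """One multi-modal news item n = <t, i, s> (field names are assumptions)."""
    text: str                # t: textual information of the news
    image_path: str          # i: the image attached to the news
    label: Optional[bool]    # s: True for real, False for fake, None if not yet classified

sample = NewsSample(text="Celebrity X reportedly dead", image_path="img/0001.jpg", label=False)
```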
C. PREPROCESSING
The news may contain inappropriate and unnecessary text, e.g., English stop-words. Such information is considered an overhead for machine learning classification algorithms in terms of processing time and memory utilization. Therefore, preprocessing the news text is essential for the performance of ELD-FN, making it fast and memory efficient. We perform the following preprocessing steps to clean the text of news.
- TOKENIZATION: Text tokenization breaks a piece of text down into smaller units called tokens. Tokens are individual words, phrases, or other meaningful text elements, which can be analyzed and processed further.
- SPECIAL CHARACTER REMOVAL: The text of news may contain special characters, e.g., the semicolon (;). This step removes special characters from the list of tokens.
- STOP-WORD REMOVAL: English text contains words that carry little meaning on their own but make sentences grammatical, called stop-words. This step removes stop-words from the working list.
- SPELL CORRECTION AND LOWERCASE CONVERSION: This step identifies and corrects spelling mistakes in the working list of tokens and converts the tokens to lowercase.
- LEMMATIZATION: The lemmatization step converts inflected and comparative word forms into their base forms, e.g., it converts the word darker into dark.
We exploit the Python Natural Language Toolkit (NLTK) for the preprocessing of news. The preprocessed news can be represented as follows:
n′ = ⟨t′, i, s⟩ (3)
where t′ = t1, t2, …, tn are the tokens of the text of n after preprocessing.
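A minimal NLTK-based sketch of these preprocessing steps is shown below, assuming the listed corpora are downloadable; spell correction is omitted for brevity, and the token filters are simplified illustrations rather than the paper's exact pipeline.

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

for pkg in ("punkt", "punkt_tab", "stopwords", "wordnet"):
    nltk.download(pkg, quiet=True)   # one-time resource downloads

STOP_WORDS = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()

def preprocess(text):
    tokens = word_tokenize(text.lower())                  # tokenization + lowercase conversion
    tokens = [t for t in tokens if t.isalnum()]           # special character removal
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop-word removal
    return [LEMMATIZER.lemmatize(t) for t in tokens]      # lemmatization

print(preprocess("The darker stories spread faster; readers rarely check them."))
```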
D. SENTIMENT ANALYSIS
Sentiment analysis is an NLP technique that involves identifying and extracting subjective information from text, i.e., opinions, attitudes, emotions, and sentiments towards a particular topic. It automatically classifies the polarity of a text as positive, negative, or neutral. We exploit the TextBlob API for the computation of sentiment. The news (Eq. 3) after sentiment computation can be represented as follows:
where v is the sentiment of n′.
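A minimal sketch of the sentiment step using TextBlob is given below; mapping the polarity score to positive/negative/neutral with a zero threshold is an assumption of these notes.

```python
from textblob import TextBlob

def sentiment(text):
    polarity = TextBlob(text).sentiment.polarity   # value in [-1.0, 1.0]
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"

print(sentiment("The vaccine works and saves lives"))   # likely "positive"
```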
E. FEATURE MODELING
This step passes the preprocessed text and images from the multi-modal dataset to V-BERT to generate the embeddings. V-BERT is an extension of the BERT model that combines the power of BERT with a visual grounding mechanism, allowing it to understand the relationship between the text and the visual information in an image. This is achieved by combining a region-based visual feature extractor with the BERT model, where each image region is encoded into a vector using a CNN. These visual features are concatenated with the input text, and the resulting sequence is fed into the BERT model. During training, V-BERT is optimized to minimize a joint loss function. This allows Visual BERT to learn language and vision representations in a unified framework and to capture the complex interactions between the two modalities. The following layers/steps are involved in ELD-FN for identifying fake/real news.
1) BERT SHARED LAYER
For news text, the BERT shared layer is implemented using a pre-trained Seq2Seq model [8]. To achieve better results, fine-tuning of the learning process is necessary and essential. To improve efficiency, a separate BERT shared layer models the textual features. The output O_T of the news text feature extractor can be represented as follows:
O_T = BERT_T(X_T)
where BERT_T is the BERT shared layer modeling the news text and X_T is the input representation of the text data.
2) IMAGE EMBEDDING LAYER
For news images, a Faster R-CNN model [8] is applied to extract features from the images. The detected objects provide visual context for the whole picture and are linked to specific terms through detailed region-level information. We also incorporate position embedding features for the images by encoding the object locations. The output O_I of the image feature extractor can be represented as follows:
O_I = BERT_I(X_I)
where BERT_I is the BERT shared layer modeling the images and X_I is the input representation of the images.
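As an illustration of the region detections and position encodings described above, the sketch below uses a pre-trained torchvision Faster R-CNN; normalizing the box coordinates as position features is an assumption of these notes, not the paper's exact formulation.

```python
import torch
import torchvision

detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = torch.rand(3, 480, 640)                 # placeholder image tensor scaled to [0, 1]
with torch.no_grad():
    detections = detector([image])[0]           # dict with 'boxes', 'labels', 'scores'

boxes = detections["boxes"]                     # (num_objects, 4) detected object locations
h, w = image.shape[1], image.shape[2]
position_features = boxes / torch.tensor([w, h, w, h])   # normalized location embeddings
```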
3) PRE-FEATURE EXTRACTION
Although the BERT shared layer is strong enough for feature extraction, ELD-FN includes a pre-feature extractor to further enhance the ability of BERT to learn semantic characteristics. The pre-feature extractor consists of a Position-wise Convolution Transformation (PCT) and a Multi-Head Self-Attention (MSA) layer.
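The following PyTorch sketch illustrates one plausible reading of such a pre-feature extractor, stacking multi-head self-attention and position-wise convolutions with residual connections; the dimensions and layer order are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class PreFeatureExtractor(nn.Module):
    """Sketch of a pre-feature extractor: Multi-Head Self-Attention (MSA)
    followed by a Position-wise Convolution Transformation (PCT)."""
    def __init__(self, dim=768, heads=8, hidden=2048):
        super().__init__()
        self.msa = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pct = nn.Sequential(                 # position-wise 1x1 convolutions
            nn.Conv1d(dim, hidden, kernel_size=1),
            nn.ReLU(),
            nn.Conv1d(hidden, dim, kernel_size=1),
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):                         # x: (batch, seq_len, dim) embeddings
        attn_out, _ = self.msa(x, x, x)
        x = self.norm1(x + attn_out)              # residual connection around MSA
        conv_out = self.pct(x.transpose(1, 2)).transpose(1, 2)
        return self.norm2(x + conv_out)           # residual connection around PCT

features = PreFeatureExtractor()(torch.randn(2, 32, 768))   # -> shape (2, 32, 768)
```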
4) MULTI-MODAL FEATURE CONCATENATION
After extracting the latent features of the text and the image, these are concatenated together to obtain the desired multi-modal feature representation. The multi-modal concatenated features O_f can be represented as follows:
O_f = O_T ⊕ O_I
where ⊕ denotes concatenation.
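A minimal sketch of the concatenation step, assuming pooled 768-dimensional text and image embeddings (the dimension is an assumption of these notes):

```python
import numpy as np

text_features = np.random.randn(4, 768)     # O_T: pooled text embeddings for a batch of 4
image_features = np.random.randn(4, 768)    # O_I: pooled image embeddings for the same batch
fused_features = np.concatenate([text_features, image_features], axis=1)   # O_f: shape (4, 1536)
```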
F. ENSEMBLE MODEL
Bagging and boosting [46] are two approaches to ensembling machine learning models. We applied both approaches with CNN and LSTM models. Four ensemble architectures (bagged CNN, bagged LSTM, boosted CNN, boosted LSTM) were evaluated using bagging and bootstrap aggregating to predict fake/real news. Note that the bagged CNN is the proposed ensemble model, as it outperforms the other mentioned ensemble architectures. The predictions of the different architectures are made using Algorithm 1.
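The sketch below shows bootstrap aggregation (bagging) over small Keras CNNs trained on the fused feature vectors; the network shape, number of estimators, and training settings are assumptions for illustration and do not reproduce Algorithm 1.

```python
import numpy as np
from tensorflow import keras

def make_cnn(input_dim):
    """One base CNN over the fused multi-modal feature vector (sizes are assumptions)."""
    return keras.Sequential([
        keras.layers.Reshape((input_dim, 1), input_shape=(input_dim,)),
        keras.layers.Conv1D(64, 5, activation="relu"),
        keras.layers.GlobalMaxPooling1D(),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(1, activation="sigmoid"),
    ])

def bagged_cnn_predict(X_train, y_train, X_test, n_estimators=5):
    """Bagged CNN: train each base CNN on a bootstrap sample and average the probabilities."""
    preds = []
    for _ in range(n_estimators):
        idx = np.random.choice(len(X_train), size=len(X_train), replace=True)  # bootstrap sample
        model = make_cnn(X_train.shape[1])
        model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
        model.fit(X_train[idx], y_train[idx], epochs=3, batch_size=64, verbose=0)
        preds.append(model.predict(X_test, verbose=0).ravel())
    return (np.mean(preds, axis=0) > 0.5).astype(int)   # averaged vote over the estimators
```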
IV. EVALUATION
This section constructs the research questions to evaluate ELD-FN, explains the exploited dataset, defines the metrics and evaluation process, and reports the findings and the threats to validity.
A. RESEARCH QUESTIONS (RQs)
The following research questions are investigated to evaluate ELD-FN.
• RQ1: Does ELD-FN outperform the baseline approaches?
• RQ2: Does news sentiment influence the identification of fake news?
• RQ3: Does preprocessing influence the identification of fake news?
• RQ4: Does ELD-FN outperform other classifiers regarding identifying fake news?
RQ1 compares ELD-FN with the baseline approaches [24], [25], named FakeNED and MultiFND in the rest of this paper. These approaches were selected as baselines because both are recently proposed, closely related to our work, and exploit the same dataset.
RQ2 investigates the influence of news sentiment on detecting fake news. It evaluates whether positive news is more likely to be considered true, and vice versa.
RQ3 examines the impact of preprocessing the news text on detecting fake news.
RQ4 investigates the impact of different deep-learning classification algorithms on ELD-FN. We compare ELD-FN against other deep learning approaches to evaluate its performance.
B. DATASET
The description of the exploited fake news dataset, Fakeddit, is presented in Table 1; the dataset is public (available online). Nakamura et al. [47] collected the data from the social news and discussion website Reddit. It consists of over 1 million pieces of news (1,063,106) from 22 subreddits. It is labeled in three different ways: 2-way, 3-way, and 6-way. Dataset samples with the 6-way classification are shown in Fig. 2. Out of the total samples, 59.12% (628,501) and 40.48% (527,049) are fake and real news, respectively. However, only 64.25% (682,966) of the samples are multi-modal. Note that we only use the multi-modal data samples with the 2-way classification to evaluate the proposed approach. Moreover, Fig. 3 and Fig. 4 show the word cloud (most common words in the dataset) and the frequency of the words, respectively.
C. PROCESS
This section explains the evaluation process of ELD-FN. After performing the preprocessing and feature modeling described in Section III, a 10-fold cross-validation technique is applied to train and test ELD-FN. The reason for using 10-fold cross-validation is that it helps avoid data bias and reduces the variance in performance estimation that might be observed with a single train-test split [48]. The dataset's multi-modal news N is broken down into ten (10) slices Ci, where i = 1, 2, ..., 10. For each cross-validation iteration, the news not in Ci is selected as the training samples (Nt) and the news in Ci as the testing samples (Nv).
The step-by-step evaluation process for the i-th cross-validation iteration is as follows: 1) all news Nt from N except Ci are extracted and combined; 2) an ensemble deep learning classifier is trained on Nt; 3) a CNN classifier is trained on Nt; 4) an LSTM classifier is trained on Nt; 5) the baseline classifiers are trained on Nt; 6) we predict whether each news item from the testing samples Ci is real or fake; and 7) the evaluation metrics described below are computed for each classifier.
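A sketch of the 10-fold protocol is shown below, assuming fused_features and labels are NumPy arrays and reusing the hypothetical bagged_cnn_predict ensemble sketched in Section III-F.

```python
import numpy as np
from sklearn.model_selection import KFold

kfold = KFold(n_splits=10, shuffle=True, random_state=42)
per_fold_results = []
for train_idx, test_idx in kfold.split(fused_features):
    y_pred = bagged_cnn_predict(fused_features[train_idx], labels[train_idx],
                                fused_features[test_idx])
    per_fold_results.append((labels[test_idx], y_pred))   # scored with the metrics below
```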
D. METRICS
We train and test the deep learning classifiers to evaluate the performance of ELD-FN. We select the most widely accepted metrics (accuracy, precision, recall, and F1-score) for this purpose. Furthermore, we compute the MCC and OR to check the effectiveness of the classifiers. The selected metrics can be presented as follows:
where TP and TN are the numbers of news items correctly predicted as real and fake, respectively. Similarly, FP and FN are the numbers of news items incorrectly predicted as real and fake, respectively.
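A sketch of the metric computation is given below; the odds-ratio formula (TP·TN)/(FP·FN) is the standard definition and is assumed here, since the paper's formulas are not reproduced in these notes.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, confusion_matrix)

def evaluate(y_true, y_pred):
    """Compute the six metrics used in the evaluation."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "accuracy":  accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall":    recall_score(y_true, y_pred),
        "f1":        f1_score(y_true, y_pred),
        "mcc":       matthews_corrcoef(y_true, y_pred),
        "or":        (tp * tn) / (fp * fn),   # odds ratio (assumed standard definition)
    }
```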
E. RESULTS
1) RQ1: COMPARISON OF ELD-FN AGAINST BASELINE APPROACHES
Table 2 and Fig. 5 present the evaluation metrics for the three approaches (ELD-FN, FakeNED, MultiFND) in terms of accuracy, precision, recall, F1-score, MCC, and OR. The results show that the average values of these metrics for ELD-FN, FakeNED, and MultiFND are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02), (89.25%, 91.12%, 87.54%, 89.29%, 0.45, and 15.78), and (78.91%, 85.27%, 76.42%, 80.60%, 0.39, and 13.95), respectively.
The F1-score distributions over the cross-validation folds for ELD-FN, FakeNED, and MultiFND are presented in Fig. 6. A beanplot is a visualization that displays a continuous variable's distribution across different groups; it compares the F1-score distributions by plotting one bean for each approach. Within a bean, the width represents the density of the data, with wider regions indicating higher density.
The following observations are made from Table 2, Fig. 5, and Fig. 6.
• ELD-FN achieves an accuracy of 88.83% and the highest precision (93.54%), indicating that it correctly classifies a large share of instances and has the highest proportion of true positives among its positive predictions.
• ELD-FN has the highest recall (90.29%) and F1-score (91.89%), indicating that it has the strongest ability to correctly identify positive instances while achieving a balance between precision and recall.
• ELD-FN also has the highest MCC (0.49) and OR (17.02), indicating a better correlation between the predicted and actual classifications and higher odds of event occurrence than FakeNED and MultiFND. The average results of MCC (0.49 > 0.45 > 0.39) > 0 and OR (17.02 > 15.78 > 13.95) > 1 hold for ELD-FN and confirm its effectiveness.
• The minimum F1-score of ELD-FN is higher than the maximum F1-scores of FakeNED and MultiFND (shown in Fig. 6).
To validate the significance of the difference in the mean performance (F1-score) across all iterations of ELD-FN, FakeNED, and MultiFND, we perform a single-factor Analysis of Variance (ANOVA). ANOVA is a statistical method used to test whether there is a significant difference in the means of three or more independent groups or samples. It is conducted in Excel with its default settings and presented in Fig. 7. It shows that F > F_crit and p-value < (α = 0.05) hold for the F1-score, and that the factor (the approach used) makes a significant difference in F1-score.
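The same single-factor ANOVA can be reproduced in Python with SciPy; the per-fold F1-scores below are placeholders, not the paper's values.

```python
from scipy.stats import f_oneway

# Placeholder F1-scores for the 10 cross-validation folds of each approach.
f1_eld_fn   = [0.92, 0.91, 0.93, 0.92, 0.91, 0.92, 0.93, 0.91, 0.92, 0.92]
f1_fakened  = [0.89, 0.90, 0.88, 0.89, 0.90, 0.89, 0.88, 0.90, 0.89, 0.89]
f1_multifnd = [0.80, 0.81, 0.80, 0.81, 0.80, 0.81, 0.80, 0.81, 0.80, 0.81]

f_stat, p_value = f_oneway(f1_eld_fn, f1_fakened, f1_multifnd)
significant = p_value < 0.05   # reject the null hypothesis of equal means
```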
Moreover, we utilize two re-sampling methods, over-sampling and under-sampling, to tackle the class imbalance within the dataset. Over-sampling generates additional samples for the minority class through RandomOverSampler, while under-sampling removes surplus records from the majority class using RandomUnderSampler. The findings reveal that employing under-sampling results in accuracy, precision, recall, and F1-score values of 86.12%, 92.54%, 88.76%, and 90.61%, respectively. However, it is important to note that under-sampling diminishes the number of majority-class samples, leading to a loss of information. Consequently, the performance on both the majority and minority classes of the fine-tuned BERT model declines when under-sampling is applied. Likewise, utilizing the over-sampling technique yields accuracy, precision, recall, and F1-score values of 90.26%, 94.37%, 91.88%, and 93.11%, respectively. This enhancement is attributed to BERT being exposed to a larger dataset, enabling it to learn meaningful patterns more effectively.
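A minimal imbalanced-learn sketch of the two re-sampling strategies, assuming X_train and y_train hold the training split:

```python
from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler

X_over, y_over = RandomOverSampler(random_state=42).fit_resample(X_train, y_train)     # over-sampling
X_under, y_under = RandomUnderSampler(random_state=42).fit_resample(X_train, y_train)  # under-sampling
```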
The preceding analysis concluded that ELD-FN outperforms the baseline approaches in detecting fake news.
2) RQ2: INFLUENCE OF SENTIMENT ON ELD-FN
- The evaluation results of ELD-FN with and without sentiment analysis are presented in Table 3 and Fig. 8. The evaluation results of ELD-FN for the two sentiment settings (enabled/disabled), in terms of accuracy, precision, recall, F1-score, MCC, and OR, are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02) and (88.12%, 90.38%, 89.98%, 90.17%, 0.49, and 17.02), respectively.
- From Table 3 and Fig. 8, it is observed that disabling sentiment (i.e., using textual features only) leads to a notable drop in precision from 93.54% to 90.38% and in F1-score from 91.89% to 90.17%. However, MCC and OR remain the same.
- Table 5 presents the relationship between sentiment and news. It shows that 65.84% of real news carries negative sentiment, whereas only 34.16% is positive; for fake news, 73.71% is negative and only 26.29% is positive. This means the likelihood of spreading fake news is 180.37% = (73.71% - 26.29%) / 26.29% higher when the news is negative, as the short check below confirms. For example, if a fake news article portrays a political figure negatively, it can contribute to a negative sentiment towards that figure among the public and will propagate quickly.
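The 180.37% figure follows directly from the two sentiment shares reported for fake news:

```python
negative_share, positive_share = 0.7371, 0.2629
relative_increase = (negative_share - positive_share) / positive_share
print(f"{relative_increase:.2%}")   # 180.37%
```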
- The preceding analysis concluded that sentiment is a critical feature for detecting fake news and that disabling it noticeably reduces the performance of ELD-FN.
3) RQ3: INFLUENCE OF PREPROCESSING ON ELD-FN
- The evaluation results of ELD-FN with and without preprocessing are presented in Table 4 and Fig. 9. The evaluation results of ELD-FN for the two preprocessing settings (enabled/disabled), in terms of accuracy, precision, recall, F1-score, MCC, and OR, are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02) and (88.49%, 92.95%, 90.11%, 90.50%, 0.49, and 17.02), respectively.
- From Table 4 and Fig. 9, it is observed that disabling preprocessing leads to a drop in accuracy from 88.83% to 88.49%, precision from 93.54% to 92.95%, recall from 90.29% to 90.11%, and F1-score from 91.89% to 90.50%. However, MCC and OR remain the same.
- The preceding analysis concluded that text preprocessing is critical for detecting fake news and that disabling it noticeably reduces the performance of ELD-FN.
4) RQ4: COMPARISON OF ELD-FN AGAINST OTHER CLASSIFIERS
- We select off-the-shelf deep learning classifiers (CNN and LSTM), which are the most widely adopted and well known. Note that the preprocessed text, its sentiment, and the feature embeddings are given as input to the selected classifiers for the comparative analysis. We set the hyper-parameter values as dropout = 0.2, recurrent_dropout = 0.2, loss function = binary cross-entropy, and activation = sigmoid for ELD-FN and both baseline approaches.
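For reference, a minimal Keras classifier head using the listed hyper-parameters might look as follows; reshaping the 1536-dimensional fused vector into 24 steps of 64 features and the LSTM width are assumptions of these notes.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Reshape((24, 64), input_shape=(1536,)),         # fused multi-modal features
    keras.layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),   # dropout settings from above
    keras.layers.Dense(1, activation="sigmoid"),                  # fake/real probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```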
- Table 6 and Fig. 10 present the evaluation metrics for ELD-FN, CNN, and LSTM in terms of accuracy, precision, recall, F1-score, MCC, and OR. The results show that the average values of these metrics for ELD-FN, CNN, and LSTM are (88.83%, 93.54%, 90.29%, 91.89%, 0.49, and 17.02), (86.73%, 92.56%, 85.81%, 89.06%, 0.48, and 16.97), and (86.51%, 90.22%, 86.19%, 88.21%, 0.48, and 16.92), respectively.
- The following observations are made from Table 6 and Fig. 10.
• ELD-FN outperforms CNN and LSTM. The performance improvement of ELD-FN over CNN in accuracy, precision, recall, F1-score, MCC, and OR is 2.42%, 1.06%, 5.22%, 3.18%, 0.01, and 0.05, respectively, while the improvement over LSTM is 2.68%, 3.68%, 4.76%, 4.17%, 0.01, and 0.10, respectively.
• ELD-FN performs better than LSTM because LSTM expects short text and performs sequential processing, which is unnecessary in our case. In contrast, CNN has proven efficient for long text and works better at extracting local invariant features.
- The preceding analysis concluded that ELD-FN outperforms other classifiers in detecting fake news.
F. THREATS TO VALIDITY
- The probability of incorrect labeling of the news is the first threat to construct validity. This research assumes that the labels assigned by Nakamura et al. [47] are correct. However, incorrectly labeled data may affect the performance of ELD-FN.
- The choice of the assessment metrics for ELD-FN is another threat to construct validity. The chosen metrics are among the most widely accepted in the literature for the classification task.
- The choice of the sentiment analysis repository is the first threat to internal validity. The chosen repository (TextBlob) is public and produces good results in computing sentiment. Exploiting other repositories may affect the performance of ELD-FN.
- The coding of ELD-FN, FakeNED, and MultiFND is the second threat to internal validity. The code and the produced results of ELD-FN, FakeNED, and MultiFND were verified to mitigate this threat. However, unknown errors may affect the performance of ELD-FN.
- The hyper-parameter settings of ELD-FN are the third threat to internal validity. The hyper-parameter settings for ELD-FN are mentioned in Section IV-E4. Changes to these settings may affect the performance of ELD-FN.
V. CONCLUSION AND FUTURE WORK
- Automatic fake news detection is crucial to avoid spreading false information, which can have serious consequences ranging from reputational damage to social and political unrest. In some cases, fake news can even incite violence and lead to harm or loss of life. Therefore, the ability to automatically identify and flag false information can help mitigate the threats of fake news. From this perspective, this paper proposes an ensemble deep learning-based detection of fake news. The proposed approach leverages NLP techniques for preprocessing the textual information of news, computes the sentiment from the text of each news item, generates embeddings for the text and images of the corresponding news by leveraging V-BERT, and passes the embeddings to a deep learning ensemble model for training and testing. The evaluation results significantly outperform the state-of-the-art approaches in identifying fake news.
- In the future, we would like to investigate the need to adapt detection algorithms to new types of media. Fake news is not limited to text-based content, and algorithms must be able to detect false information in images, videos, and audio as well. Moreover, we are interested in improving the interpretability of detection algorithms. Current methods often rely on opaque deep learning models, making it difficult to understand how decisions are made. Future work could focus on developing more transparent models or tools that help users understand how algorithms arrive at their conclusions.