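Multi-modal label recommendation based on text-enhanced co-attention mechanism - Journal of Shaanxi Normal University Publishing House website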

Journal of Shaanxi Normal University (Natural Science Edition)

Special Topic on Artificial Intelligence

Multi-modal label recommendation based on text-enhanced co-attention mechanism

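FENG Haonan, HE Zhiyong, MA Liangli*

(School of Electronic Engineering, Naval University of Engineering, Wuhan 430000, Hubei, China)

MA Liangli, female, professor, doctoral supervisor; her research focuses on system architecture and system reliability. E-mail: maliangl@163.com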

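Abstract:

On new social platforms, users typically attach hashtags to their posts to mark keywords or topics and thereby raise their engagement on social media. To address this setting, a hierarchical structure is used to extract text features at three levels: words, phrases, and sentences. A summary attention mechanism for text content is proposed to condense the semantic content of each level into a single feature vector, and a text-enhanced co-attention model is then proposed to fuse the semantics of each level with the image modality. In addition, because hashtag preferences and habits differ from user to user, an external memory unit is introduced to record each user's historical hashtag habits; a similarity influence vector between the post currently awaiting recommendation and the user's historical posts is computed to build a personalization module for the user. Experimental results on real-world datasets show that the proposed model, which combines multi-modal post content understanding with the personalization module, improves substantially over other models in precision, recall, and F1 score; the two proposed attention mechanisms for multi-modal content understanding and the user personalization modeling each contribute significantly to the overall recommendation performance.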
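The summary attention step described in the abstract, which condenses the features of one text level into a single vector, can be illustrated with a generic attention-pooling sketch. The abstract does not give the paper's formulation, so everything here is an assumption: the scoring function is a plain dot product, and `query` is a hypothetical stand-in for whatever learned parameters the authors use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def summary_attention(features, query):
    """Pool a sequence of feature vectors into one summary vector.

    features: list of T vectors of length d (e.g. word-level features)
    query:    length-d scoring vector (hypothetical; assumed learned in practice)
    Returns (pooled_vector, attention_weights).
    """
    # Score each position by its dot product with the query (an assumption).
    scores = [sum(f_k * q_k for f_k, q_k in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Weighted sum of the feature vectors, one coordinate at a time.
    pooled = [sum(w * f[k] for w, f in zip(weights, features)) for k in range(dim)]
    return pooled, weights
```

Under these assumptions the same function could be applied separately to the word-, phrase-, and sentence-level features, yielding one summary vector per level before fusion with the image features.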

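Keywords:

text hierarchical modeling; co-attention mechanism; text attention mechanism; multimodal recommendation; personalized recommendation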

Received:

2022-02-03

CLC number:

TP391

Document code:

Article ID:

1672-4291(2023)05-0060-07

Funding:

13th Five-Year Plan Pre-research Project (41412010801)

DOI:

10.15983/j.cnki.jsnu.2023027

Multi-modal label recommendation based on text-enhanced co-attention mechanism

FENG Haonan, HE Zhiyong, MA Liangli*

(School of Electronic Engineering, Naval University of Engineering, Wuhan 430000, Hubei, China)

Abstract:

On new social platforms, users usually attach hashtags to their posts to mark keywords or topics, which increases their participation in social media. In this article, considering that the text of a user's post can better express the user's own thoughts, a hierarchical structure is used to extract text features at three levels (words, phrases, and sentences), and a summary attention mechanism for the text content is proposed. The semantic content of each level is summarized as a feature vector, and a text-enhanced co-attention model is then proposed to fuse the semantics of each level with the image modality. At the same time, considering that different users have different hashtag preferences, an external storage unit is introduced to record the historical hashtag habits of each user; the similarity influence vector between the current post to be recommended and the historical posts is calculated to establish the user personalization module. The overall hashtag recommendation results are generated from the analysis of multi-modal post content understanding and the personalization module. Experimental results on real data sets show that the proposed model improves substantially in precision, recall, and F1 score compared with other models. The two attention mechanisms for multi-modal content understanding and the user personalization modeling proposed in this paper each contribute significantly to the overall recommendation effect.
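The personalization idea, comparing the current post against a user's stored history and letting similar past posts influence the recommendation, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the similarity measure or how the influence vector is formed, so cosine similarity, softmax weighting, and the `(post_vector, tag_ids)` memory layout are all hypothetical choices.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def history_influence(current, memory, num_tags):
    """Build a per-hashtag influence vector from a user's history.

    current:  feature vector of the post awaiting recommendation
    memory:   list of (post_vector, tag_ids) pairs, a hypothetical layout
              for the external storage unit
    num_tags: size of the hashtag vocabulary
    """
    influence = [0.0] * num_tags
    if not memory:
        return influence  # no history: no personalized signal
    sims = [cosine(current, vec) for vec, _ in memory]
    # Softmax over similarities (an assumed weighting scheme).
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    z = sum(exps)
    for e, (_, tags) in zip(exps, memory):
        for t in tags:
            influence[t] += e / z  # similar posts push their hashtags up
    return influence
```

In a full model such a vector would presumably be combined with the content-based scores before ranking hashtags; how the paper performs that combination is not stated in the abstract.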

Keywords:

text hierarchical modeling; co-attention mechanism; text attention mechanism; multimodal recommendation; personalized recommendation