Emotion Classification Method Based on Topic-Weighted LDA Model
Abstract: The LDA (Latent Dirichlet Allocation) topic model generates a large number of topics, and in a large share of them the words are only weakly related to one another. Such topics are hard to interpret and degrade the downstream applications built on the model. To address this problem, this paper proposes an emotion classification method based on a topic-weighted LDA model. The model applies a weighted calculation to the features of internally related words in different topics, which eliminates the mutual influence of related words across topics. Experimental results show that, compared with classification based on the traditional LDA model, the proposed method improves the average F1 value by 6.7% to 8.1%, which confirms that it is effective and improves classification performance.
Key words:
- LDA model
- weighted feature
- topic model
- emotion classification
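
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of weighting topics by how related their words are. Everything in it is an assumption made for illustration: the relatedness score (share of top-word pairs that co-occur in at least one document), the helper names `topic_relatedness` and `weight_topic_features`, and the toy data. It is not the paper's method.

```python
from itertools import combinations


def topic_relatedness(top_words, docs):
    """Hypothetical score: fraction of top-word pairs that co-occur
    in at least one document. Returns a value in [0, 1]."""
    doc_sets = [set(d) for d in docs]
    pairs = list(combinations(top_words, 2))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if any(a in s and b in s for s in doc_sets))
    return hits / len(pairs)


def weight_topic_features(theta, weights):
    """Scale a document's topic proportions by per-topic weights,
    then renormalize so the features sum to 1."""
    scaled = [t * w for t, w in zip(theta, weights)]
    total = sum(scaled)
    if total == 0:
        # All weights are zero: fall back to the unweighted proportions.
        return list(theta)
    return [s / total for s in scaled]


# Toy corpus and toy "top words" per topic (assumed, not from the paper).
docs = [
    ["good", "great", "film"],
    ["bad", "boring", "film"],
    ["good", "film", "bad"],
]
topic_top_words = [
    ["good", "great", "film"],   # words that appear together
    ["great", "boring", "bad"],  # words that mostly do not
]

weights = [topic_relatedness(words, docs) for words in topic_top_words]

# Topic proportions of one document, as an LDA model might output them.
theta = [0.5, 0.5]
features = weight_topic_features(theta, weights)
```

In this toy run the first topic receives weight 1 and the second 1/3, so the weighted features shift from an even split to about 0.75 and 0.25. The resulting vector could then be passed to any classifier in place of the raw topic proportions.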