Predicting microblog user retweet behaviors based on ensemble learning
Zhang Xiaowei, Wang Wei, Qin Dongxia
Abstract:
To improve the accuracy of predicting user retweet behavior on a microblog social network, this paper proposes a method based on ensemble learning. First, it comprehensively analyzes the features that affect user retweet behavior and extracts features covering user attributes, social relationships, and microblog content. Then, using the extracted features, it predicts retweet behavior separately with logistic regression, a support vector machine (SVM), and a back-propagation (BP) neural network. Finally, it fuses the individual predictions by weighted voting. Experimental results show that the ensemble method improves the F1 measure, used as the overall evaluation metric, by 1.5% compared with the BP neural network alone.
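The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = retweet, 0 = no retweet), one prediction list per base classifier, and a weighted majority rule. The function names `weighted_vote` and `f1_score`, the weights, and the toy data are hypothetical and are not taken from the paper, which does not state in the abstract how its weights are chosen.

```python
from typing import List, Sequence


def weighted_vote(predictions: Sequence[Sequence[int]],
                  weights: Sequence[float]) -> List[int]:
    """Fuse binary predictions from several classifiers by weighted voting.

    A sample is labeled 1 when the classifiers voting 1 hold more than
    half of the total weight (an assumed decision rule).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    fused = []
    for votes in zip(*predictions):  # iterate sample by sample
        score = sum(w for v, w in zip(votes, weights) if v == 1)
        fused.append(1 if score > total / 2 else 0)
    return fused


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely for illustration (not results from the paper).
y_true = [1, 0, 1, 1, 0, 0]
lr_pred = [1, 0, 0, 1, 0, 1]   # stand-in for logistic regression output
svm_pred = [1, 1, 1, 0, 0, 0]  # stand-in for SVM output
bp_pred = [1, 0, 1, 1, 1, 0]   # stand-in for BP neural network output
weights = [0.3, 0.3, 0.4]      # hypothetical weights

fused = weighted_vote([lr_pred, svm_pred, bp_pred], weights)
print("fused predictions:", fused)
print("F1 of BP alone:", round(f1_score(y_true, bp_pred), 3))
print("F1 of ensemble:", round(f1_score(y_true, fused), 3))
```

In practice the weights might be derived from each classifier's validation performance; that choice is an assumption here, and the toy numbers say nothing about the gains reported in the paper.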
Keywords:
Sina microblog; retweet behavior prediction; ensemble learning; social relationship
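Foundation: National Natural Science Foundation of China (U1504602); Science and Technology Research Project of Henan Province (172102210089; 162102210396); Natural Science Foundation of Henan Province (152300410129); Key Scientific Research Project of Higher Education Institutions of Henan Province (15A520125; 17A520019; 15A520114)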
DOI: 10.16366/j.cnki.1000-2367.2018.02.018