Microblog users' Big-Five personality prediction based on multi-task learning

Journal of University of Chinese Academy of Sciences, 2018, Vol. 35, Issue (4): 550-560. DOI: 10.7523/j.issn.2095-6134.2018.04.019

ZHENG Jinghua¹, GUO Shize², GAO Liang², ZHAO Nan³

1. Electronic Engineering Institute, Hefei 230037, China;
2. Institute of Northern Electronic Equipment, Beijing 100083, China;
3. Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China

Received: 2017-03-02 Revised: 2017-05-04 Published: 2018-07-15
Corresponding author: SU Li
Funding: Supported by a Provincial and Ministerial Major Project (AWS13J003) and the National Natural Science Foundation of China (61602491)

Abstract

Traditional methods for predicting the personality of social network users rely on single-task classification or regression. They ignore the potentially related information shared across tasks and rarely achieve good predictions from small-scale training data. This paper proposes a robust multi-task learning (RMTL) method to predict the Big-Five personality of microblog users; it both shares the relations among tasks and identifies irrelevant (outlier) tasks. The parameter matrix is decomposed accordingly into a structure term and an outlier term, the nuclear norm and the L₁/L₂ norm serve as the regularization terms, and the problem is then solved as an optimization problem. The method is validated on real Sina Microblog user data: averaged over the five dimensions, accuracy, precision, and recall reach 67.3%, 71.5%, and 74.6%, respectively. On the same data set, RMTL outperforms the four single-task learning methods and the multi-task learning method used for comparison.

Key words: Sina microblog, personality prediction, multi-task learning, robustness, prediction accuracy
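
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common low-rank plus group-sparse formulation, in which the parameter matrix is W = L + S, the nuclear norm penalizes the structure term L, and a column-wise L₁/L₂ norm penalizes the outlier term S, so that a non-zero column of S marks an outlier task. The squared loss, the plain proximal gradient loop, and the names `prox_nuclear`, `prox_group_columns`, `fit_rmtl`, `alpha`, and `beta` are all assumptions made for illustration.

```python
import numpy as np


def prox_nuclear(M, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def prox_group_columns(M, tau):
    """Proximal operator of tau * sum of column L2 norms: shrink each column as a group."""
    norms = np.linalg.norm(M, axis=0)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return M * scale


def fit_rmtl(Xs, ys, alpha=0.1, beta=0.1, n_iter=500):
    """Proximal gradient sketch for W = L + S with one column per task.

    Xs, ys: lists of per-task design matrices (n_t x d) and targets (n_t,).
    Assumption: squared loss; the paper's actual loss is not given in the abstract.
    """
    d, m = Xs[0].shape[1], len(Xs)
    L = np.zeros((d, m))
    S = np.zeros((d, m))
    # The smooth part depends on L + S, so its gradient Lipschitz constant
    # in the joint variable (L, S) is twice the largest per-task curvature.
    lip = 2.0 * max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / lip
    for _ in range(n_iter):
        W = L + S
        # Gradient of the averaged squared loss, one column per task.
        G = np.column_stack([
            X.T @ (X @ W[:, t] - y) / X.shape[0]
            for t, (X, y) in enumerate(zip(Xs, ys))
        ])
        L = prox_nuclear(L - step * G, step * alpha)
        S = prox_group_columns(S - step * G, step * beta)
    return L, S
```

In this sketch, `alpha` controls how strongly the tasks are pushed toward a shared low-rank structure and `beta` controls how readily a task is declared an outlier. With five tasks, one per Big-Five dimension, the columns of `S` that remain non-zero after fitting would be read as the tasks that do not share the common structure.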

CLC number: TN911.22

ZHENG Jinghua, GUO Shize, GAO Liang, ZHAO Nan. Microblog users' Big-Five personality prediction based on multi-task learning[J]. Journal of University of Chinese Academy of Sciences, 2018, 35(4): 550-560.

