Journal of University of Chinese Academy of Sciences ›› 2018, Vol. 35 ›› Issue (4): 550-560. DOI: 10.7523/j.issn.2095-6134.2018.04.019

• Computer Science •

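Big-Five personality prediction based on multi-task learning

ZHENG Jinghua1, GUO Shize2, GAO Liang2, ZHAO Nan3

  1. Electronic Engineering Institute, Hefei 230037, China;
    2. Institute of Northern Electronic Equipment, Beijing 100083, China;
    3. Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2017-03-02 Revised: 2017-05-04 Published: 2018-07-15
  • Corresponding author: SU Li (苏荔)
  • Funding:
    Supported by a provincial/ministerial-level major project (AWS13J003) and the National Natural Science Foundation of China (61602491)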

Microblog users' Big-Five personality prediction based on multi-task learning

ZHENG Jinghua1, GUO Shize2, GAO Liang2, ZHAO Nan3   

  1. Electronic Engineering Institute, Hefei 230037, China;
    2. Institute of Northern Electronic Equipment, Beijing 100083, China;
    3. Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2017-03-02 Revised: 2017-05-04 Published: 2018-07-15

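Abstract: Traditional methods for predicting the personality of social network users rely on single-task classification or regression. Such methods ignore the latent relatedness among multiple tasks and struggle to achieve good predictive performance when training data are scarce. We propose a robust multi-task learning model for predicting the Big-Five personality of microblog users; it shares relatedness information across tasks while also identifying irrelevant tasks. Accordingly, the parameter matrix is decomposed into a structure term and an outlier term, which are regularized with the nuclear norm and the L1/L2 norm, turning the problem into an optimization problem. The method is validated on real Sina Microblog user data: averaged over the five dimensions, accuracy, precision, and recall reach 67.3%, 71.5%, and 74.6%, respectively. Compared with traditional single-task learning methods and a multi-task learning method on the same dataset, the proposed robust multi-task learning method gives better predictions.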
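The abstract does not state the objective function. As an assumed formulation for orientation only, a common way to write such a structure-plus-outlier decomposition is shown below, where $W = L + S$ is the $d \times m$ parameter matrix with one column per task ($m = 5$ here), $L$ is the structure term, $S$ is the outlier term, $\mathcal{L}_t$ is a per-task loss, and $\alpha, \beta > 0$ are regularization weights. The symbols, the assignment of the nuclear norm to $L$ and the mixed norm to $S$, and the reading of "L1/L2 norm" as the column-wise group norm are all assumptions, not taken from the paper.

```latex
\min_{L,\,S \in \mathbb{R}^{d \times m}} \;
\sum_{t=1}^{m} \mathcal{L}_t\bigl(X_t(\ell_t + s_t),\, y_t\bigr)
\;+\; \alpha\,\lVert L \rVert_{*}
\;+\; \beta\,\lVert S \rVert_{1,2},
\qquad
\lVert L \rVert_{*} = \sum_{i} \sigma_i(L),
\quad
\lVert S \rVert_{1,2} = \sum_{t=1}^{m} \lVert s_t \rVert_{2}
```

Under this reading, the nuclear norm pushes $L$ toward low rank, so related tasks share a common subspace, while the group norm drives whole columns of $S$ to zero, so only tasks that do not fit the shared structure keep a nonzero outlier column.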

Keywords: Sina microblog, personality prediction, multi-task learning, robustness, prediction accuracy

Abstract: Most traditional methods for predicting the personality of social network users are based on single-task classification or regression. They ignore potentially useful relations among multiple tasks and rarely achieve good prediction results with small-scale training data. In this paper, a robust multi-task learning method (RMTL) is proposed to predict the Big-Five personality of microblog users; it not only shares information across related tasks but also identifies irrelevant (outlier) tasks. The model is first decomposed into two components, a structure term and an outlier term, and the nuclear norm and the L1/L2 norm are then used as regularization terms, which turns the task into an optimization problem. Using Sina Microblog users' data, we validate the RMTL method: the average accuracy, average precision, and average recall over the five dimensions are 67.3%, 71.5%, and 74.6%, respectively. The RMTL method outperforms the four single-task learning methods and the multi-task learning method used for comparison.
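The decomposition into a structure term and an outlier term can be illustrated with a short proximal-gradient sketch. Everything in it is an assumption made for illustration and not the authors' implementation: squared loss on ±1 labels, one feature matrix `X` shared by all five tasks, the nuclear norm applied to the structure term `L`, a column-wise group norm applied to the outlier term `S`, and the hypothetical names `prox_nuclear`, `prox_group_columns`, `objective`, and `fit_rmtl`.

```python
import numpy as np


def prox_nuclear(M, tau):
    """Singular-value soft-thresholding: proximal operator of tau * ||M||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def prox_group_columns(M, tau):
    """Column-wise group soft-thresholding: prox of tau * sum_t ||M[:, t]||_2."""
    norms = np.linalg.norm(M, axis=0)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return M * scale


def objective(X, Y, L, S, alpha, beta):
    """Squared loss plus nuclear-norm and group-norm penalties (assumed form)."""
    n = X.shape[0]
    fit = 0.5 * np.linalg.norm(X @ (L + S) - Y) ** 2 / n
    return fit + alpha * np.linalg.norm(L, "nuc") + beta * np.linalg.norm(S, axis=0).sum()


def fit_rmtl(X, Y, alpha=0.1, beta=0.1, n_iter=500):
    """Proximal gradient on (L, S); X is (n, d), Y is (n, tasks) with +/-1 labels."""
    n, d = X.shape
    n_tasks = Y.shape[1]
    L = np.zeros((d, n_tasks))  # structure term, pushed toward low rank
    S = np.zeros((d, n_tasks))  # outlier term, pushed toward all-zero columns
    # The smooth part depends on L + S, so its gradient with respect to the
    # stacked variable (L, S) has Lipschitz constant 2 * ||X||_2^2 / n.
    step = n / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        G = X.T @ (X @ (L + S) - Y) / n  # same gradient for L and for S
        L = prox_nuclear(L - step * G, step * alpha)
        S = prox_group_columns(S - step * G, step * beta)
    return L, S


def predict(X, L, S):
    """Binary prediction per task from the combined parameter matrix W = L + S."""
    return np.sign(X @ (L + S))
```

In this sketch, tasks whose column of `S` stays nonzero after fitting are the ones treated as outliers; the remaining tasks are explained by the shared low-rank term `L` alone. The weights `alpha` and `beta` would need tuning, for example by cross-validation.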

Key words: Sina microblog, personality prediction, multi-task learning, robustness, prediction accuracy
