Journal of University of Chinese Academy of Sciences ›› 2021, Vol. 38 ›› Issue (2): 181-188. DOI: 10.7523/j.issn.2095-6134.2021.02.004

• Mathematics and Physics

Imbalanced data credit scoring model based on Group-Lasso method

WEI Yongfeng, XIANG Yibo

  1. School of Management, University of Science and Technology of China, Hefei 230026, China
  • Received: 2019-05-17 Revised: 2019-07-08 Published: 2021-03-15
  • Corresponding author: XIANG Yibo
  • Supported by:
    Natural Science Foundation of Anhui Province (1808085MG222)

Abstract: Commercial banks face highly complex personal credit risk, and modeling that risk is a key step in managing it. Using credit card data from a commercial bank, we build a credit scoring model that predicts customers' default probability. The ROSE (random over sampling examples) method handles the class imbalance, the Group-Lasso method (AUC criterion) selects the variables, and the scoring model itself is a Logistic regression. Empirical results show that the model built on the class-balanced sample is more effective than the other models compared, in both discriminating ability and predictive ability. The resulting model can serve as an effective basis for customer credit evaluation decisions, can guide banks and other financial institutions in assessing personal credit risk, and has good operability in practice.

Key words: credit scoring, Logistic regression, Group-Lasso method, ROSE
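
The abstract names three ingredients: ROSE resampling for class imbalance, Group-Lasso variable selection guided by AUC, and a Logistic regression scoring model. This page does not show the authors' implementation, so the following is only a minimal Python sketch of how such a pipeline might fit together. Everything in it is an assumption made for illustration: the function names (`rose_like_sample`, `fit_group_lasso_logistic`, `make_data`), the `shrink` noise-scale rule (a simplified stand-in for the kernel bandwidth that ROSE uses), the synthetic data, the grid of `lam` values, and the reading of "AUC criterion" as choosing the penalty by validation AUC.

```python
import math
import random


def rose_like_sample(X, y, n_out, shrink=0.5, seed=0):
    """Smoothed bootstrap: pick a class with probability 1/2, pick one of its
    rows at random, then jitter every feature with Gaussian noise."""
    rng = random.Random(seed)
    d = len(X[0])
    rows = {c: [x for x, t in zip(X, y) if t == c] for c in (0, 1)}
    sd = {}
    for c, r in rows.items():  # per-class feature spread sets the noise scale (assumed rule)
        mu = [sum(x[j] for x in r) / len(r) for j in range(d)]
        sd[c] = [math.sqrt(sum((x[j] - mu[j]) ** 2 for x in r) / len(r))
                 for j in range(d)]
    Xs, ys = [], []
    for _ in range(n_out):
        c = rng.randint(0, 1)
        base = rng.choice(rows[c])
        Xs.append([base[j] + rng.gauss(0.0, shrink * sd[c][j]) for j in range(d)])
        ys.append(c)
    return Xs, ys


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(b0, beta, X):
    # Predicted default probabilities under a logistic model.
    return [sigmoid(b0 + sum(w * v for w, v in zip(beta, x))) for x in X]


def fit_group_lasso_logistic(X, y, groups, lam, step=0.1, n_iter=200):
    """Proximal gradient descent on mean logistic loss
    + lam * sum_g sqrt(p_g) * ||beta_g||_2 (intercept unpenalised)."""
    n, d = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * d
    for _ in range(n_iter):
        resid = [p - t for p, t in zip(predict(b0, beta, X), y)]
        b0 -= step * sum(resid) / n
        for j in range(d):
            beta[j] -= step * sum(r * x[j] for r, x in zip(resid, X)) / n
        for idx in groups:  # block soft-thresholding: a group shrinks as a unit
            norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
            cut = step * lam * math.sqrt(len(idx))
            scale = max(0.0, 1.0 - cut / norm) if norm > 0 else 0.0
            for j in idx:
                beta[j] *= scale
    return b0, beta


def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_data(n, seed):
    # Synthetic stand-in for credit data: features 0-1 carry signal, 2-3 are noise.
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    y = [int(rng.random() < sigmoid(-3.0 + 1.5 * x[0] + 1.0 * x[1])) for x in X]
    return X, y


if __name__ == "__main__":
    groups = [[0, 1], [2, 3]]  # e.g. dummy columns of two categorical variables
    X_tr, y_tr = make_data(400, seed=1)
    X_va, y_va = make_data(400, seed=2)
    X_bal, y_bal = rose_like_sample(X_tr, y_tr, n_out=400)
    for lam in (0.0, 0.02, 0.1, 0.5):  # keep the lam with the best validation AUC
        b0, beta = fit_group_lasso_logistic(X_bal, y_bal, groups, lam)
        print(lam, round(auc(predict(b0, beta, X_va), y_va), 3), beta)
```

The group penalty is what distinguishes this from an ordinary Lasso: all coefficients in a group (for instance, the dummy variables of one categorical predictor) are kept or dropped together. Validation AUC is computed on the original, unbalanced data, since resampling is meant to help fitting rather than to change the population being scored.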
